Brief Reports


The Journal of Consulting Psychology will 
accept Brief Reports of research studies in 
clinical psychology for early publication with- 
out expense to the author. The procedure is 
intended to permit the publication of soundly 
designed studies of specialized interest or lim- 
ited importance which cannot now be ac- 
cepted because of lack of space. Several pages 
in each issue will be devoted to Brief Reports, 
published in the order of their receipt with- 
out respect to the dates of receipt of the regu- 
lar articles. Most Brief Reports appear in the 
first or second issue to go to press following 
their final acceptance. 


An author who wishes to submit a Brief 
Report: 


1. Sends the Brief Report, limited to one printed 
page and prepared according to the specifications 
given below. 

2. Also sends to the Editor a full report of the re- 
search study, in sufficient detail to give a clear ac- 
count of its background, procedure, results, and con- 
clusions, which will be filed with the American 
Documentation Institute to insure indefinite avail- 
ability. 

3. Prepares at least 100 mimeographed copies of 
the full report, which the author will send without 
charge to all who request it as long as the supply 
lasts. 


4. Agrees not to submit the full report to another 
journal of general circulation. 


Specifications 


Brief Report. The Brief Report should give 
a clear, condensed summary of the procedure 
of the study and as full an account of the re- 
sults as space permits. 

To insure that the Brief Report will be no 
longer than one printed page, its typescript,
including all matter except the title and the
author’s lines, must not exceed 80 lines av- 
eraging 42 characters and spaces in length. 
Set the typewriter margins for short lines of 
42 characters, which are 3.5 inches long in 
elite typing, and 4.2 inches long in pica. 
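
The limits above reduce to simple arithmetic, which the following Python sketch works through. It reads the rule of 80 lines averaging 42 characters as a total character budget, which is one interpretation of the wording. The function names are illustrative only, and the pitches (12 characters per inch for elite, 10 for pica) are the usual typewriter values, which reproduce the 3.5- and 4.2-inch figures given.

```python
MAX_LINES = 80          # line quota stated in the specifications
AVG_LINE_CHARS = 42     # average characters and spaces per line
PITCH = {"elite": 12, "pica": 10}  # characters per inch (usual typewriter pitches)


def line_width_inches(typeface):
    """Width in inches of a 42-character line for the given typeface."""
    return AVG_LINE_CHARS / PITCH[typeface]


def fits_brief_report(lines):
    """Check a typescript given as a list of text lines.

    Title and author lines are assumed to have been left out already.
    """
    budget = MAX_LINES * AVG_LINE_CHARS          # 3360 characters and spaces
    total = sum(len(line) for line in lines)
    return len(lines) <= MAX_LINES and total <= budget


print(line_width_inches("elite"), line_width_inches("pica"))  # 3.5 4.2
```

Counting headings, tables, and references toward the quota, as the specifications require, would simply mean including those lines in the list passed to the check.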

The manuscript of the Brief Report must 
be double spaced throughout. Except for its 
short lines, it follows the standard style of 
the 1957 revision of the APA Publication 
Manual. Headings, tables, and references are 
avoided or, if essential, must be counted in 
the 80 lines. Each Brief Report must be ac- 
companied by a footnote in the style below, 
which is typed on a separate sheet and not 
counted in the 80-line quota:¹


¹ An extended report of this study may be obtained
without charge from John Doe, 300 Market
St., Prospect 6, Mass. (giving the author’s full name
and address), or for a fee from the American Documentation
Institute. Order Document No. ——, remitting
$—— for microfilm or $—— for photocopies.


Extended report. Because the extended re- 
port is intended for photoduplication, and is 
not copy to be sent to a printer, its style 
should differ in several ways from that of 
other manuscripts: (a) The extended report 
should be typed with single spacing for 
economy in duplication. (b) Tables and figures
should be placed adjacent to the text
which refers to them. A caption should be 
typed below each figure. (c) Footnotes should 
be typed at the bottom of the page on which 
reference is made to them. In other respects, 
the full report is prepared in the style speci- 
fied by the Publication Manual. 





EDITORIAL 


With this issue, the Journal of Consulting Psy- 
chology has become the responsibility of a new 
Editor, punctuating 12 years of guidance by 
Laurance F. Shaffer. I do not plan to mark this 
occasion by any extended review of that signifi- 
cant period nor by an elaborate statement of a 
prospectus. The retiring Editor plans a report of 
his stewardship, and to him naturally falls the 
responsibilities and well earned satisfactions such 
a review will offer. 

An editor of a well established journal, as this 
one is, finds little scope for radical invention. 
The American Psychological Association, through 
its Publications Board, defines its general sphere 
as “the clinical journal.” As such, its contents 
include original research relevant to psychologi- 
cal diagnosis, psychotherapy and counseling, per- 
sonality, and the dynamics of behavior. Although 
quantitative studies have been given priority, 
relevant theoretical contributions, case studies, 
and descriptions of clinical techniques are also 
acceptable. 

Since readers and authors are a vastly over- 
lapping group, I am using this opportunity of 
catching their attention to emphasize the differ- 
ence between the official definition of the Jour- 
nal’s scope and what is defined by the articles 
that appear between its covers. The primacy of 
original research is easily evident even to the 
casual reader. The retiring Editor, with the help 
of other research minded clinical psychologists, 
has removed any lasting doubt that the subject 
matter of clinical psychology is susceptible to 
thoroughgoing empirical treatment. From here 
on, the image of the journal must be sketched in 
largely imaginary lines. Theoretical contributions 
occur at the rate of slightly more than one a 
year. Case studies and descriptions of clinical 
techniques are virtually unrepresented. 

Why the discrepancy between the official and 
the functioning images? Some may suggest that 
it reflects the Editor’s biases. Experience with my 
first 200 submitted manuscripts makes me doubt 
the validity of that explanation. In that sample, 
there were the usual one or two theoretical con- 
tributions that were worthy enough to be ac- 
cepted for publication. The two case studies of- 
fered were done without clear focus as to their 
purpose or with conviction that they could be 
scholarly contributions. 


I refuse to believe that the potential for a 
more rounded contribution to the literature on 
clinical psychology does not exist. I affirm my 
faith that we will not endanger our newly won 
rigor and empiricism when we give our attention 
and the Journal’s pages to imaginative flights be- 
yond the fully charted regions, to speculative ex- 
cavations that will open up new mines of data 
and to the vivid presentation of clinical experiences
from which research in clinical psychology,
above all else, should draw its nourishment.

There ought to be more evidence of the fer- 
ment in our field. We need reviews that force all 
of us to pause in our mad chase for data to
contemplate what the data at hand have to offer.
When so many studies of manifest anxiety have 
been completed, it is time for someone to ex- 
amine the question of what we have learned and 
of which are the most profitable new directions 
to pursue. Incidentally, too many manuscripts 
hurry to tell what data have been collected, how 
obtained, and the results of their analysis—be- 
fore taking any trouble to show the reader why 
he should care what answer is forthcoming. I 
have heard the opinion expressed that the Jour- 
nal wants short papers; authors are expected to 
excise all theory. This is not to be our policy.
Similarly, even this brief experience has made 
me acutely conscious of how much fragmenting 
of research reports exists. Within one volume, or 
distributed over two, a man will publish two or 
more interrelated studies that would best be re- 
ported as one. I shall act on the conviction that 
one well rounded report incorporating a series of 
related studies is worth more than the sum of the 
piecemeal reports of the parts of the series.

As I write, I find that a fighting mood rises. I 
do think we need some controversy, but without 
rancor. In constructive fashion, we need to re- 
view each other’s work. We sorely need critical 
reviews of specific tests or clinical techniques, or 
of ways of reaching specific types of inferences 
of interest to the clinician. The data regarding 
many of our most used clinical instruments, es- 
pecially the Rorschach and Minnesota Multi- 
phasic, are so vast and so heterogeneous that 
they defy review as a whole. It seems to me that 
there can be no meaningful answer to the ques- 
tion of whether either instrument is valid. It is 
more meaningful to take a particular type of
inference that is drawn by use of the instrument,
examine the logical structure on which it rests, 
and review the data which will show how tenable 
that logical structure is. 

Finally, there is a need for communication and 
mutual stimulation that can lead to the evolution 
of new clinical procedures. We need discussions 
and descriptions on such topics as patterns of 
group practice, intake practices, or problems of 
supervision—to mention but a few. 

Some may fear that I am inviting rambling,
indistinctly formed, free associative contributions
such as might come from a diary of a clinical 
psychologist. This is far from my purpose. The 
Journal of Consulting Psychology will accept 
only those contributions that show evidence that 
the creative spark has been fanned into flame 
through the application of a rigorous scholarship 
which places the contribution in its theoretical 
and historical context. 
EDWARD S. BORDIN
Editor


A CONCEPTUAL MODEL OF PROJECTIVE TECHNIQUES
APPLIED TO STIMULUS VARIATIONS
WITH THEMATIC TECHNIQUES


BERNARD I. MURSTEIN


University of Portland 


The clinical psychologist today would hardly 
accept without reservation the view that the 
projective techniques necessarily evoke re- 
sponses tapping the “. . . private world of 
the individual, comprising as it does, the feel- 
ings, urges, beliefs, attitudes, and desires of 
which he may be dimly aware and which he 
is often reluctant to admit even to himself, 
much less to others . . .” (Frank, 1948, p. 
66). Neither would he, in most cases, leap 
to second the view that “The interpretative 
methods [TAT] offer the subject a situation 
or action to which he responds by a creative 
activity, wherein is disclosed his basic con- 
cepts, expectations, feelings” (Frank, 1948, 
p. 57). 

Such views have not sufficiently emphasized
the importance of the stimulus properties of 
the various techniques as well as such back- 
ground characteristics as examiner-examinee 
relationship in determining the response 
elicited. What is currently needed is a the- 
ory to account for the interaction of the vari- 
ous determinants of responses to projective 
techniques. In this regard, the theory of 
adaptation-level of Helson (1955) seems most 
helpful. The theory briefly stated would hold 
that: 


Operationally the adaptation level is represented 
by the stimulus to which the organism responds, 
either not at all or in an indifferent or neutral man- 
ner. In a large variety of situations the AL proves 
to be a weighted log mean of three classes of stimuli: 
the stimulus in the focus of attention; all stimuli in 
the field forming the context or background; and 
residuals from past experience. These three classes of 
stimuli pool or interact to determine the AL and, 
hence, the adjustment of the organism. As stimulation
or behavior varies, the adaptation level fluctuates
accordingly. A simple formula expresses these
facts more clearly and concisely as follows:

log A = p log X + q log B + r log R


where p, q, and r are weighting factors showing the 
relative importance of X—the stimulus, B—the back- 
ground, and R—the residual (pp. 91-92) 
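
As a rough numerical illustration of the formula quoted above, the following Python sketch computes A as a weighted log mean. The function name, the example magnitudes, and the weights are hypothetical. The sketch also assumes that the three weights sum to 1, which the phrase "weighted log mean" suggests but the quoted passage does not state.

```python
import math


def adaptation_level(stimulus, background, residual, p, q, r):
    """Return A such that log A = p log X + q log B + r log R."""
    if min(stimulus, background, residual) <= 0:
        raise ValueError("magnitudes must be positive to take logarithms")
    log_a = (p * math.log(stimulus)
             + q * math.log(background)
             + r * math.log(residual))
    return math.exp(log_a)


# Hypothetical magnitudes and weights, chosen only for illustration.
level = adaptation_level(100.0, 25.0, 4.0, p=0.5, q=0.25, r=0.25)
print(round(level, 3))  # 31.623
```

With equal weights the result is the ordinary geometric mean of the three magnitudes; raising p pulls A toward the focal stimulus, and raising q or r pulls it toward the background or the residual.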


In adapting this approach to the analysis 
of responses to projective techniques, it may 
be useful to spell out each of the three cate- 
gories in some detail. 


Stimulus Properties of Tests 


Under this heading may be considered the 
varying degrees of ambiguity characteristic of 
the different techniques. Thus, the Draw-A- 
Person has no structure aside from the limita- 
tions ensuing from the instructions to “draw 
a whole person,” and the use of an 8 × 11
sheet of white paper and a pencil. The Ror- 
schach has a series of blots selected on the 
basis of certain shading, color, and structural 
qualities. The TAT, which is more hetero- 
geneous in its make-up, contains some cards 
in which the characters are difficult to dis- 
tinguish and other cards in which the char- 
acters involved are easily distinguished, but 
the feelings they are experiencing or actions 
engaged in are not readily apparent. 


Background Characteristics 


Here one deals with the environment both 
psychological and physical in which the test 
is given. Questions of concern are the purpose 
of the testing from the point of view of the 
subject and from the point of view of the 
examiner. Is the test given to obtain norma- 
tive data from college sophomores, or to de- 
termine whether a “model prisoner” is to be 
paroled? Other variables influencing the response
will be the examiner’s sex, age, attractiveness,
and interest in the subject.


Personality Characteristics 


Within the construct of personality one 
deals with the subcategories: (a) organismic
needs and (b) learned needs. Examples of organismic
needs are hunger, sex drive, thirst,
and fatigue. Learned needs have to do with 
cultural values and ego needs. The two, of 
course, may well interact in that one might 
drink a cup of tea (organismic need) while 
holding the cup and saucer in a culturally 
prescribed manner (learned need). 

The perception of such needs on a projective
technique is a function (a) of the intensity
of the need and (b) the expectancy of
satisfying that need. These factors have been
pointed out by Rotter (1954). The importance
of Factor b in the manifestation of the
projected need may be illustrated through ref- 
erence to the work of Brozek, Guetzkow, and 
Baldwin (1951). Using males subjected to a 
semistarvation diet for 24 weeks, the authors 
reported no significant increase in perception 
of food on the Rorschach. It should be ap- 
parent, therefore, that the presence of a need 
without the means of satiating it may inhibit 
the perception of objects relating to the need 
rather than sensitizing one to them. The work 
of Levine, Chein, and Murphy (1942) also 
seems to substantiate this conclusion. 

Lastly, the S himself interprets the stimu- 
lus, background and variables applying to the 
self and, either consciously or unconsciously, 
arrives at a decision as to what he will re- 
port to the examiner. The end response may 
thus be viewed as composed of two compo- 
nents, that which is within the control of the 
perceiver and that which is beyond his con- 
trol (i.e., where the needs are overt and too 
strong to control, or are unacceptable and 
covert as in unconscious needs). 

It would seem, then, that in addition to 
underestimating the importance of the stimu- 
lus and background factors, many clinicians 
have underestimated the strength of the con- 
trol which many Ss exercise over the re- 
sponses they manifest. It has been assumed 
that the S must reveal his “private world.” 
If he does so consciously there is no difficulty
save that the same information might have
been obtained more economically (both from 
the point of view of time as well as money) 
through a brief interview. If, on the other 
hand, the S’s defenses prevent the free ex- 
pression of his problem, it has been assumed 
that the character of his defenses may be 
analyzed in order to get at the problem. In 
short, no response is held to be an “accident.” 
In its broadest sense this statement is doubt- 
lessly true. From an empirical standpoint, 
however, many responses are “error variance” 
or superfluous to the problem which the cli- 
nician seeks to answer. Thus, on the Ror- 
schach one might obtain a response to the 
white space of Card 2 of “A new type of jet 
plane. I saw it yesterday on television.” The 
response ostensibly may have little to add to 
the understanding of the S. But the clinician 
might think, “Why of all the possible objects 
he saw yesterday, did he happen to report a 
jet airplane?” The S might well state that he 
had seen other planes, ships, etc., but that 
this particular space seemed to rather closely 
resemble “the delta-swept wing jet.” Such an 
answer would rarely satisfy some clinicians. 
It is unfortunate that the effect of stimulus 
properties in determining a response is so 
thoroughly neglected in favor of wholly sub- 
ject-oriented interpretations. 

Another obstacle in the understanding of 
projective techniques has been the failure to 
accept the view that even with severely dis- 
turbed persons, some defenses may be impos- 
sible to pierce from an analysis of the data. 
The omission of response as occurs in a Ror- 
schach record totaling only five responses, is 
probably indicative of “something,” but that 
“something” will rarely be apparent from the 
analysis of the five responses. 

If we are to build a theory of projective 
techniques, much more data are needed with 
regard to the roles played by the stimulus, 
background, and personality variables. This 
paper undertakes an analysis of the effect of 
stimulus variation of TAT-type pictures to- 
ward the end of systemizing and interpreting 
some of the data already obtained. It will at- 
tempt to list the key studies in this area and 
to clarify the importance of the findings for 
projective test theory. 







Stimulus Variations with TAT-Type Tests 


Stimulus variations have been used with re- 
gard to light control, length of time cards 
have been exposed, changes in the construc- 
tion of the drawings, and number of possible 
interpretations that might apply to each card. 
It is proposed to consider the work done in 
each of these areas to determine their effect 
upon the responses obtained. 


Variations in Lighting 


Weisskopf (1950a) had college students 
write descriptions of several TAT cards which 
were presented either in an unaltered form, 
or as pictures taken under reduced exposure, 
giving the pictures a hazy effect. No significant
differences in “transcendence” (mean
number of responses going beyond pure
description) were obtained. More recently, 
Bradley and Lysaker (1957), using a life- 
like TAT type picture, varied the illumina- 
tion from normal to three successive darker 
stages of illumination, as well as three lighter 
ones. The Ss were several hundred house- 
wives (range 122 to 171 for each stage) with 
each S seeing the picture in only one stage. 
Similar to Weisskopf’s (1950a) finding, no 
difference in productivity of response was 
noted. In analyzing the content of associa- 
tions with regard to the picture, however, a 
clear positive linear relationship was found 
between increasing degree of darkness and 
pleasantness of association. When the picture 
was used with a varying degree of increasing 
light, the picture slightly lighter than the 
control elicited the most favorable associa- 
tion, followed by the next lighter picture, the 
lightest picture, and the control picture. The 
effect of light was apparently not a linear 
function, with a moderate amount of supra- 
normal illumination considered most pleas- 
ant, and either more or less illumination con- 
sidered less pleasant. 

It seems clear that with the absence of 
light, the stimulus properties of the picture 
became increasingly vague, thus resulting in 
an increasing “internal” perception on the 
part of the S. In the relative absence of sen- 
sory stimulation, a person finds it easier to 
introspect and to relate personal material 
with considerably less inhibition than under
normal circumstances. The soothing yet fantasy
enriching effect of darkness is what
makes a motion picture more enjoyable in a 
“movie” house than under the semilighted 
atmosphere of a television room. Moreover, 
many psychotherapists have their seating ar- 
ranged so as to be out of the patients’ direct 
gaze on the assumption that there is less in- 
terference with the patients’ private thoughts. 
Here, too, the darkened room is often more 
likely to educe free association than is the 
normally lighted one. 

The supra-illumination effects are more diffi- 
cult to interpret. Possibly, a limited amount 
of extra light obscures the picture and per- 
mits pleasant fantasies to be evoked. An in- 
creasing amount of light may be irritating to 
the nervous system and accordingly elicit less 
pleasant associations. The relative lack of 
“pleasantness” occurring under normal illu- 
mination may have stemmed from the stimu- 
lus properties being most clear at this stage, 
and hence the subjects having been more 
likely to respect the neutral emotional tone 
of the picture (woman baking a cake).

The question may arise as to why an asso- 
ciation elicited under less than optimal sen- 
sory conditions should be pleasant rather 
than unpleasant. Two factors may be sus- 
pected of playing a part here. In the first 
place, fantasy often occurs as a temporary 
vicarious means of satisfying needs and ten- 
sions which are as yet unsatisfied in the outer 
world. The teen-age girl dreams of a prince 
charming, the young executive imagines him- 
self as a junior vice-president. Many of our 
fantasies are thus related to a need to suc- 
ceed in a variety of activities, the attainment 
of which would be associated with pleasant 
feelings. It is, of course, true that unpleasant, 
even hostile, tensions may also be present. 
The young girl may be annoyed because a 
rival seems to have currently won her prince 
charming; the young executive may chafe 
under the imposition of being a “yes” man 
to the “boss.” Nevertheless, militating against 
the appearance of hostility on the TAT 
would be the fact that the S would probably 
deem the testing situation as inappropriate 
for the expression of hostility. The inhibitory 
effect of the “background” might be due to 
a weak generalization gradient from hostility
to the “boss” to hostility in an ambiguous
situation, or the S might simply inhibit all
expression of hostility because he felt he was 
being judged and he wanted to make a good 
impression. Since it is socially acceptable to 
perceive pleasant things and not as accept- 
able to perceive hostility in others, it is more 
than likely that when the stimulus proper- 
ties of the object being perceived are weak 
enough, the S will be apt to fantasize pub- 
licly only over a narrow band of ideation, 
namely, those topics which do not conflict 
with what the S perceives to be the ego de- 
mands of the social situation in which he is 
involved. 


Variations in Time Exposure 


Weisskopf (1950a) presented college Ss 
with 2 sets of TAT pictures exposed for .2 of 
a second and for 5 seconds. The latter group 
manifested a higher number of fantasy scores 
(“transcendence”). Kenny (1954) exposed 
the TAT pictures to college Ss for 5 seconds 
and 2 minutes, respectively. The correlation 
between “personality revealingness” of the 
themes elicited and “transcendence” was .62 
and .64 for the 5-second and 2-minute con- 
ditions of exposure. 

Apparently, the individual fantasizes with 
greater personal involvement as well as 
greater productivity when he has sufficient 
time to appraise the stimulus qualities of the 
card. After a sufficient time has elapsed 
(optimally, perhaps, a few seconds), added 
length of exposure has little effect. 


Variations in the “Ground” 


Completely traced line drawings proved to 
be more effective in evincing fantasy than
incompletely traced ones, in two studies by
Weisskopf (1950b) and Weisskopf-Joelson
and Lynn (1953). Bradley and Lysaker 
(1957) reported that varying the background 
of a picture, when the figure (woman baking 
a cake) remained constant, had no effect in- 
sofar as altering the productivity of responses 
to the pictures. These studies seem to indicate 
that ambiguity of the figure affects fantasy 
production, but alteration of the background 
has little effect provided the figure remains 
clear. 


Variations of Central Figures 


Thompson Modification. Murray (1943) 
suggested that at least one card should be 
chosen, showing a figure of approximately 
the same age and sex as the S. Tomkins 
(1947), on the other hand, has proposed that 
the TAT may be interpreted most meaning- 
fully by taking into account the psychologi- 
cal distance of the stimulus from the S. More- 
over, the meaningfulness of the responses 
elicited by the TAT should be a function of 
the “remoteness” of the thematic material 
presented. 

Thompson (1949), adhering to Murray’s 
suggestion, believes that the closer the stimu- 
lus resembles the actual S, the more the S 
will identify with the figure and, accordingly, 
be likely to produce more meaningful mate- 
rial. To test this hypothesis, he constructed 
a set of TAT cards similar to the original 
TAT, except that Negro characters were sub- 
stituted for white ones. Using 26 Negro 
college students from a Negro school, the 
Thompson TAT was group-administered by 
a Negro examiner who flashed the cards on 
a screen by means of a projector. The Ss 
were instructed to write stories to the pro- 
jected pictures. In order to compare the Mur- 
ray and Thompson versions, the Ss were di- 
vided into two groups of 13 each, with half 
receiving the Thompson series first and the 
Murray version second, while the other half 
received both series with the order of pres- 
entation reversed. Thompson found a signifi- 
cant increase (p < .01) in story length to the 
Thompson TAT for each of the 10 cards 
used. 
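
The counterbalanced order of presentation described above can be sketched in Python as follows. The text does not say how Ss were allocated to the two orders, so the random assignment, the function name, and the fixed seed are assumptions made purely for illustration.

```python
import random


def counterbalance(subject_ids, first="Thompson", second="Murray", seed=0):
    """Split subjects into two equal groups that receive the two series
    in opposite orders (an AB, BA design)."""
    ids = list(subject_ids)
    if len(ids) % 2:
        raise ValueError("need an even number of subjects")
    # Random allocation is an assumption; the source does not describe it.
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return {
        (first, second): sorted(ids[:half]),
        (second, first): sorted(ids[half:]),
    }


groups = counterbalance(range(1, 27))  # 26 Ss, as in the study described
for order, members in groups.items():
    print(" then ".join(order), len(members))
```

Because every S sees both series and each order is used equally often, any simple practice or fatigue effect is spread evenly over the two sets of cards.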

More recent studies, however, have been 
highly critical of the value of the Thompson
modification. Riess, Schwartz, and Cotting- 
ham (1950) used 30 Negroes and 30 white 
females from Hunter College in New York 
City. The usual AB, BA order for the two 
TAT tests was followed, using a Negro ex- 
aminer for half of both the Negro and white 
group, and a white examiner for the other 
half. The results were: 


Negroes and whites in the North produce stories 
that differ insignificantly in length regardless of 
whether the stimulus material is Negro or not, and 
regardless of the color of the examiner, with the exception
of a tendency for northern whites to increase
story length on Negro stimulus material with a Negro
examiner (Riess et al., 1950, p. 708).


In another article using the same data 
(Schwartz, Riess, and Cottingham, 1951), 
the authors investigated the number of ideas 
appearing in the stories rather than merely 
the story length. Their data showed that (a) 
Negroes expressed the most ideas when given 
the cards by a white examiner. The set of 
cards used was of little consequence. (b)
Whites gave more ideas when the Thompson 
set was used. This result was not a function 
of whether the examiner was white or Negro. 
(c) The production of ideas was quite low 
when the stimulus, examiner, and Ss were of 
the same race. 

Korchin, Mitchell, and Meltzoff (1950) 
used two groups of 80 Negro and 80 white 
male Ss from Philadelphia, half of each group 
being “middle class” while the other half was 
of “low” socioeconomic status. The examiner 
was white. An analysis of variance showed no 
significant effects due to race, nor any sig- 
nificant interaction between race and status 
for story length. Only the “class” differences 
were significant (p < .01), with the “middle 
class” telling the longer stories. 

In still another experiment, Light (1955) 
divided 26 white students into two groups of 
13 each, one-half receiving the Murray TAT, 
while the other received the Thompson ver- 
sion. No significant differences in story length 
were found, although Light found certain 
themes to be more frequent with the Thomp- 
son set. These themes were crime, poverty, 
occupational inferiority, witchcraft, and pros- 
titution. 

Cook (1953) used 60 male college students 
(30 Negro, 30 white), divided into 4 groups 
of 15 each. One-half of the Negro and white 
groups each received the Thompson TAT, 
while the remaining two groups receiving the 
Murray TAT were composed of the remain- 
ing Negro and white Ss, respectively. The ex- 
aminer here also was white, and most of the 
Ss had spent at least 15 years in the South. 
Cook wished to examine the tendency toward
ego-defensiveness as a function of a
decrease in remoteness between
the S and the stimuli used. The measures
of ego-defensiveness used were word
count, compliance with instructions given,
vagueness of stories, number of words indi- 
cating uncertainty, number of alternatives of- 
fered, number of references to the pictures 
(i.e., excessive use of description), and num- 
ber of different themes offered. No significant 
differences solely attributable to the two sets 
of pictures were found. Both the Negroes and 
whites were interviewed with regard to their 
perception of the test. It was found that the 
great majority of the Negroes perceived both 
the Murray and Thompson TATs as dealing 
with “people in general.” On the other hand, 
the whites perceived the Thompson TAT as 
dealing with Negroes rather than “people in 
general.” Hence, it would be expected that
the chief differences between the Negro and 
white Ss would be found with regard to the 
Thompson TAT. This was precisely what 
happened. The Murray TAT yielded only 
two significant findings at the .05 level. Ne- 
groes used more words indicating uncertainty 
and made more references to the pictures. 

On the Thompson TAT, however, the pro- 
tocols of the whites had a significantly higher 
word count. The Negroes had a greater num- 
ber of alternatives, uncertainty, vagueness, 
and references to the pictures. All measures 
were significant beyond the .01 level. Cook 
believed his findings indicated that the more 
remote the relationship between the stimulus 
and the S, the less the ego-defensiveness. 

The foregoing studies lead to a considera- 
tion of the following: 

1. Are Negroes to be treated as a homo- 
geneous class? The data hardly suggest that 
such a treatment is warranted. Apparently, it 
would be more justifiable if anything to use 
a “socioeconomic class” approach to inter- 
pretation. 

2. The degree of similarity between stimu- 
lus and subject is of itself insufficient as an 
explanation of the type of response elicited be- 
cause it does not take cognizance of the back- 
ground characteristics. Thus, in Schwartz's 
(1951) work, Negro Ss who were adminis- 
tered Negro cards by a Negro examiner 
manifested a smaller number of ideas in their 
stories than any other grouping. Negro Ss re- 
ceiving the same cards from a white examiner 
expressed the greatest number of ideas. For 
the white Ss, however, the Negro cards
seemed to educe the greatest number of ideas
regardless of the race of the examiner. 

3. In view of the fact that the Stimulus ×
Background × Person interaction seems to be
of considerable importance in determining the
nature of the response, the t test favored by
previous investigators would appear to be a 
relatively insensitive statistic. The analysis of 
variance would appear to be more powerful 
in this regard. 

4. The question arises as to whether story 
length or number of ideas expressed is an ade- 
quate measure of the S’s involvement in the 
TAT. Is it not conceivable that sophisticated 
Ss (college students) might manifest an ex- 
cessive superficial verboseness to mask the 
projection of manifest or covert needs? What 
is needed is a quantitative scale measuring 
the personality meaningfulness of the stories 
rather than indirect and unproven correlates 
such as “word-count” and “number of ideas.” 

5. The influence of the stimulus is not a 
simple function of its similarity to the S. The 
culture plays a crucial role in the interpreta- 
tion of perception. Negroes tended to perceive 
the Murray TAT characters not as “whites” 
but as “people in general.” Whites, however, 
perceived the Thompson cards as dealing with 
“Negroes.” Thus, majority populations are 
far more apt to perceive minorities as “dif- 
ferent” than vice versa. They also are less 
apt to have had contact with them. Minority 
groups such as Negroes, however, must con- 
stantly live in a “white” world. Although cer- 
tainly aware of the special privileges whites 
possess which they do not, they must adapt 
perceptually to the majority white culture if 
they are to be maximally adjusted to their 
environment. Thus, the finding that Negro 
girls preferred white dolls to Negro ones 
(Goodman, 1952) should come as no sur- 
prise. When, however, Negroes are confronted 
with the Thompson figures, particularly in 
the presence of a white examiner, it is appar- 
ent to them that something unusual and dif- 
ferent is involved, and they tend to be cau- 
tious and vague. 
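
The argument of Point 3 above can be illustrated with a small numerical sketch. The cell means are invented for illustration and are not taken from any of the studies reviewed; the factor labels and the function name `effects_2x2` are likewise hypothetical. In a 2 × 2 layout (stimulus by background) with a pure crossover pattern, both marginal differences vanish, so a comparison of marginal means shows nothing, while the interaction contrast is large.

```python
# Hypothetical cell means (e.g., ideas per story); NOT data from the studies reviewed.
# Keys: (stimulus level, background level).
cell_means = {
    ("card set 1", "examiner 1"): 4.0,
    ("card set 1", "examiner 2"): 8.0,
    ("card set 2", "examiner 1"): 8.0,
    ("card set 2", "examiner 2"): 4.0,
}

def effects_2x2(m):
    """Return (stimulus main effect, background main effect, interaction contrast)."""
    a = m[("card set 1", "examiner 1")]
    b = m[("card set 1", "examiner 2")]
    c = m[("card set 2", "examiner 1")]
    d = m[("card set 2", "examiner 2")]
    stimulus_effect = (a + b) / 2 - (c + d) / 2    # difference of row means
    background_effect = (a + c) / 2 - (b + d) / 2  # difference of column means
    interaction = (a - b) - (c - d)                # difference of differences
    return stimulus_effect, background_effect, interaction

card_effect, examiner_effect, interaction = effects_2x2(cell_means)
print(card_effect, examiner_effect, interaction)
```

With these invented means both marginal differences are zero and the interaction contrast is -8, so a t test on either factor alone would detect nothing, whereas the interaction term of an analysis of variance would carry the whole effect.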

Other Modifications. Lasaga y Travieso 
and Martinez-Arango (1946), in working 
with nuns, found no improvement in the diag- 
nostic value of the TAT stories when they 
substituted nuns for the usual TAT central 
figures. Weisskopf and Dunlevy (1952) used 
three groups of 10 each of normal, crippled, 
and obese male undergraduates. Each S re- 
ceived three sets of TAT cards, one of which 
was unmodified, the other sets being modi- 
fied so that the central figure was obese in 
one set and crippled in the other. The Ss 
were asked to describe the cards rather than 
to tell a story. The results indicated that the 
mean transcendence index (number of ele- 
ments in the stories going beyond pure de- 
scription) was significantly less for the obese 
figures than for the normal and crippled ones. 
This trend was consistent among all three 
groups. 

In another study, Weisskopf-Joelson and 
Money (1953) modified the TAT with some- 
what crude approximations of the original 
drawings. In the first series, the head of the 
central character was a more or less neutral 
face. In the second session, the same cards 
had actual photos of the Ss’ heads instead of 
the original neutral heads. A control group 
was used to account for the effect of repeti- 
tion. No significant increase in projection was 
found for the “photo” set over the neutral 
set as measured by the transcendence index 
and word content. In addition, the diagnostic 
value of the “photo” set when judged by two 
judges was not found to be any higher than 
the other set. 

In a recent study by Edgar and Shneidman 
(1958), a group composed of psychotics, psy- 
choneurotics, and persons suffering from per- 
sonality disorders was used. Material obtained 
from “patient government” discussions was 
compared with that obtained from the MAPS 
technique and from a variation of this tech- 
nique in which full-length cut-out photos of 
every person in residence on the ward at the 
time, as well as of all staff members, were 
used. Using 
Bales Interaction Process Method to analyze 
the data, the authors found as much reluc- 
tance to show antagonism to photos of peer 
figures as to peers in face-to-face contact, 
such as occasioned by the group meetings. 
They suggest that one use of the “photo” 
technique may be in measuring the feelings 
of need for evaluation from others and feel- 
ings of need for direction. Nevertheless, the 
authors concluded that “photos, in general, 
seem to elicit more positive feeling than fan- 
tasy figures did, the latter calling forth more 
overt aggression” (Edgar and Shneidman, 
1958, p. 11). 

It would appear then that similarity of the 
central character to the S is not only non- 
effectual in furthering projection but actually 
may be detrimental. Since obesity is the butt 
of jokes in our society, it is unlikely that 
fat persons derive any satisfying tension re- 
duction in identifying with obese characters. 
Before concluding, however, that similarity 
per se inhibits projection, it may be of value 
to recall that the background characteristics 
(purpose of the experiment) are much more 
readily apparent to the S under these condi- 
tions. Thus, while the S taking the TAT un- 
der the usual nomothetic experimental pro- 
cedure may not suspect that his responses as 
an individual will be noteworthy, the S who, 
as in Weisskopf-Joelson and Money’s study 
(1953), sees a photo of his head planted 
somewhat out of context on a crude sketch 
may not suffer this delusion. 

It may be that under nonsocial conditions 
similarity might well stimulate projection. In- 
deed, the mirror probably does richly stimu- 
late fantasies in teen-agers as they carefully 
preen themselves before going out on a 
“heavy date.” Some support for this state- 
ment comes from the data of Beier, Izard, 
Smock, and Tougas (1957) who gave young 
adults sets of photos of younger, peer, and 
older persons having them signify the ones 
liked and disliked. Male Ss most frequently 
“liked” peer males while female Ss preferred 
peer females. 

In sum, however, as far as the usual test- 
ing procedure is concerned, it seems clear that 
similarity to the point of idiosyncratic identi- 
fication promotes ego defensiveness and a re- 
duction in the degree of projection. 


The Use of Animals Instead of Humans 


Bellak and Bellak (1949), believing that 
children identify more readily with pictures 
containing animal figures than with human 
ones, created the Children’s Apperception 
Test. Bills (1950) compared a series of pic- 
tures of rabbits with several TAT cards using 
as Ss children (both male and female) aged 
five through ten. Significantly longer stories 
were obtained with the “rabbit” pictures. In 
another study (Bills, Leiman, & Thomas, 
1950), a new group of third-grade Ss had six 
play-therapy sessions and then took the “rab- 
bit” pictures test and TAT. Both tests corre- 
lated only slightly with the material obtained 
from play therapy using Murray’s rating ap- 
proach for manifest needs. Their correlations 
with each other also were rather low, ranging 
from -.09 to .58. 

Biersdorf and Marcuse (1953) used a broad 
approach to the problem employing CAT 
cards and cards specially constructed, which 
were almost identical to the CAT cards ex- 
cept that humans were substituted for the 
animals while keeping the same emotional ex- 
pressions. The Ss were 30 first graders, the 
group containing both sexes. Not only were 
no significant differences found between num- 
bers of words used in both tests, but further 
analysis revealed no difference in length of 
time before response, length of response time, 
number of words used, number of ideas used, 
number of characters mentioned in the pic- 
tures, and the number mentioned who were 
not in the pictures. Mainord and Marcuse 
(1954), using the same sets of cards, tested 
emotionally disturbed boys and girls (mean 
age, seven years) and found similar nonsig- 
nificant differences for the aforementioned 
quantitative measures. They, however, also 
asked five clinicians to rate the protocols as 
to “clinical usefulness,” the judges being ig- 
norant of the hypotheses of the authors and 
the stimulus properties of the cards eliciting 
the protocols. The human figure cards were 
found to be more “clinically useful” at better 
than an .001 level of confidence. 

In yet another study comparing the CAT 
with its “humanized” analogue, Furuya 
(1957) studied the protocols of 72 children 
ranging in age from 6 to 12. He was inter- 
ested in answers to the following two ques- 
tions. 

1. With different criteria and age-groups 
from those of Biersdorf and Marcuse (1953) 
and Mainord and Marcuse (1954)—what 
kind of difference will be found between the 
productivity to animal pictures to human 
pictures, equivalent in scenes and situations? 

2. Is there any tendency toward relative 
decrease of productivity to animal pictures 
compared with human pictures as the children 
grow older (Furuya, 1957, p. 248)? 

The results are confounded with methods of 
presentation in that the youngest children re- 
ceived the cards individually, while the older 
ones received the cards as a group. Neverthe- 
less, the results are consistent with earlier 
studies in indicating the superiority of the 
human pictures. Of the 20 t tests, four proved 
to be significant at the .05 level or better. 
These indicated that the percentage of stories 
containing expressions of feeling, significant 
conflict, and having a definite outcome, was 
significantly higher for the human pictures 
as compared to the animal ones. 

No tendency was found for any increase or 
decrease in the relative productivity of the 
animal or human pictures from the youngest 
to the oldest age range. 

Lastly, the author implies that the results 
might have been even more extensive but for 
the fact that “most of the animal pictures 
that have been used, were ‘personified or hu- 
man-like’ animal pictures” (Furuya, 1957, p. 252). 

Light (1954), working with 9- and 10-year- 
olds, found significant differences between the 
CAT and TAT. The TAT was found to elicit 
more feelings, different kinds of feelings, con- 
flicts, number and kinds of outcomes, num- 
ber and kinds of themes, and number of fig- 
ures. Only the number of words was not sig- 
nificantly different from test to test. Light 
believed that this result may have been due 
to the imposed time limit of seven minutes to 
write the stories after the pictures had been 
flashed on a screen. 

Armstrong (1954) working with intellec- 
tually superior first, second, and third graders 
also found no difference in number of words 
used between the CAT and a set of pictures 
corresponding in style and composition but 
containing humans instead of animals. The 
Weisskopf transcendence index, however, in- 
dicated that the children gave significantly 
more nondescriptive statements to the hu- 
man series. 

The results of the foregoing studies do not 
support the Bellaks’ assumption that the use 
of animal pictures results in an increased 
facilitation of projection for children. For the 
most part, where the human and nonhuman 
drawings have been strictly comparable (Arm- 
strong, 1954; Biersdorf and Marcuse, 1953), 
the human pictures have been shown to be 
equal or superior to the animal pictures. The 
only negative note sounded (Bills, 1950) is 
not strictly comparable since the “rabbit pic- 
tures” utilized may have been conducive to 
projection due to the nature of the situation 
portrayed on the cards rather than to the 
presence of animal figures of themselves. 


Variations in Ambiguity Between TAT Cards 


Murray (1943), in the manual for the The- 
matic Apperception Test, described the cards 
as “. . . divided into two series of ten pic- 
tures each, the pictures of the second series 
(numbers 11 through 20), being purposely 
more unusual, dramatic, and bizarre than 
those of the first. One full hour is devoted to 
a series, the two sessions being separated by 
a day or more” (p. 2). 

Weisskopf (1950a) found that college stu- 
dents, when asked to describe both sets, pro- 
jected significantly more fantasy material to 
the first “everyday” series than to the lat- 
ter “fairy tale” one. Of interest also was the 
fact that women tended to have significantly 
higher “transcendence” indices than men for 
both the “male” (p < .02) and “female” se- 
ries (p < .05). A heterosexual group of col- 
lege students were asked by Bijou and Kenny 
(1951) to rank 21 “male” series TAT cards 
according to their ambiguity (estimated num- 
ber of interpretations that might be derived 
from each card). The authors computed a t 
test between the mean ambiguity level for the 
first Murray series as compared to the second 
one. The second “fairy tale” series proved to 
be more ambiguous, at the .05 level of con- 
fidence, although the authors did not reject 
the null hypothesis in postadopting an .01 
level. They also compared the “personality 
revealingness” value of the first series as com- 
pared to the second one by analyzing the 
themes of the TAT stories (Kenny and 
Bijou, 1953). The mean “personality reveal- 
ingness” value of the “everyday” series proved 
to be somewhat higher than the “fairy tale” 
series, but the t test of .92 was not statisti- 
cally significant. 

Using the same “male” series of 21 cards, 
the authors found a rho of .75 between 28 
male and 23 female college students. Since 
the multijudge reliability was .80, there ap- 
peared to be little sex difference with regard 
to the judgment of ambiguity. Dividing 15 
cards, selected from the 21, into three groups 
of 5 cards each, according to their high, me- 
dium, or low ambiguity, they found that the 
medium-ambiguous series yielded themes of 
greater “personality revealingness” than either 
the low or highly ambiguous series. The dif- 
ference in “revealingness” was not a function 
of the length of the story told. Dividing the 
15 cards into the dichotomy used by Murray, 
8 were found to belong to the “everyday” se- 
ries, while 7 were assigned to the “fairy tale” 
series. No significant difference in “transcend- 
ence” was found between the two groups of 
cards. The interaction between ambiguity, 
“transcendence,” and exposure time showed 
that the medium-ambiguity pictures yielded a 
greater amount of “transcendence” than the 
high-ambiguity pictures under long exposure 
time (2 minutes) but not under short ex- 
posure time (5 seconds). The medium-am- 
biguity group of pictures did, however, show 
a higher “transcendence” than the least am- 
biguous pictures under both long and short 
exposure time. 

Murstein (1958b), using Bijou and Kenny’s 
(1951) rankings of ambiguity of a “male” se- 
ries of cards with some modifications, found 
significant curvilinear correlations with the 
number of themes elicited from an all-male 
population composed of both college students 
and psychiatric patients used by Eron (1950). 
The moderately ambiguous cards produced 
the most themes, while the high- and low- 
ambiguous cards were less productive. While 
the curvilinear correlation involving the het- 
erogeneous group and male-only group proved 
to be highly significant, the use of the fe- 
male rankings of ambiguity did not yield a 
significant difference between eta and the 
Pearson r. The range of etas was from .43 to 
.95, the Pearson’s r’s ranging from .03 to .09. 
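
The contrast between eta and the Pearson r can be made concrete with a brief sketch. The scores below are invented (three ambiguity levels, with the middle level most productive) and are not Murstein's data; `pearson_r` and `correlation_ratio` are illustrative helper names. The correlation ratio is computed here as the square root of the between-group sum of squares divided by the total sum of squares.

```python
from math import sqrt

# Invented theme counts per card at low (1), medium (2), and high (3) ambiguity.
# These are NOT data from the studies reviewed.
groups = {1: [3, 4, 5], 2: [8, 9, 10], 3: [3, 4, 5]}

def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def correlation_ratio(groups):
    """Eta: sqrt(between-group SS / total SS) for scores grouped by level."""
    all_y = [y for ys in groups.values() for y in ys]
    grand = sum(all_y) / len(all_y)
    ss_total = sum((y - grand) ** 2 for y in all_y)
    ss_between = sum(
        len(ys) * (sum(ys) / len(ys) - grand) ** 2 for ys in groups.values()
    )
    return sqrt(ss_between / ss_total)

# Flatten the grouped scores into paired (level, score) observations.
xs = [level for level, ys in groups.items() for _ in ys]
ys = [y for scores in groups.values() for y in scores]

r = pearson_r(xs, ys)
eta = correlation_ratio(groups)
print(round(abs(r), 2), round(eta, 2))
```

For this inverted-U pattern the linear correlation is essentially zero while eta is about .94, which is the kind of discrepancy between the two statistics that signals a curvilinear relationship.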

In another study, Murstein (1958a) had 12 
female Ss rank a “female” series of 20 cards 
for “pleasantness” as well as for “psychologi- 
cal ambiguity.” When compared to the “emo- 
tional tone” scores obtained by Eron (1953) 
from a similar female population, the follow- 
ing average correlations were obtained: am- 
biguity vs. emotional tone, .56; ambiguity vs. 
pleasantness, .40; pleasantness vs. emotional 
tone, .59. Thus, the results indicated that the 
more ambiguous the TAT card, the more 
“pleasant” the “emotional tone” of the story 
was found to be; the more “pleasant” the 
stimulus properties of the card, the more 
“pleasant” the “emotional tone” of the stories. 

The results of the ambiguity experiments 
seem to show that the more ambiguous a pic- 
ture is, the more pleasant are the themes 
elicited in the stories told to it. The more 
structured pictures are usually negatively 
structured (Murstein, 1958a) and, hence, 
more likely to elicit unpleasant themes. 
The medium-ambiguous pictures usually elicit 
more themes as well as more “personality re- 
vealing” material. 

In appraising the work reviewed in this pa- 
per the following conclusions seem warranted: 


Conclusions Derived From TAT Modifications 


1. A lack of visual stimulation (darkness) 
seems to enhance the occurrence of pleasant 
association to TAT-type cards. The effect of 
supranormal illumination is more complex 
with a slight increase resulting in a mani- 
festation of more pleasant responses than un- 
der normal lighting. As the intensity of illu- 
mination is further increased, however, the 
pleasantness of association diminishes. 

2. Mere similarity between the S and the 
central figure in the stimulus card is insuffi- 
cient to expedite the projection of mean- 
ingful responses. Projection is most readily 
elicited when the stimulus figure is culturally 
high in status and yet not so similar to the 
S as to arouse his suspicion as to the pur- 
pose of testing. It is not similarity per se that 
accounts for the censoring of responses, but 
the new awareness of the background charac- 
teristics (i.e., purpose of the testing) which 
causes the S to filter his responses. Other 
background factors such as the race of the 
examiner may also play a part in determin- 
ing the nature of the response elicited. 

3. The fact that Negro Ss when tested by 
white examiners tend to be superficial in ap- 
proach, hewing closely to mere description of 
the thematic cards, hardly means that they 
do not have tensions which could be percep- 
tually expressed. Rather, the lack of involve- 
ment may have been a function of a low ex- 
pectancy of satisfying these needs even to the 
extent of verbalizing them, because of the 
inhibiting presence of the examiner. With a 
Negro examiner, however, the stories told by 
Negroes would probably be more meaningful, 
though this particular comparison has not as 
yet been made. 

The tendency of whites to give richer 
stories to the Thompson series rather than to 
the Murray TAT would seem to be a way of 
expressing either a need for verbalization of 
tolerance or a feeling of freedom in talking 
about a socially subservient group. Presum- 
ably, the Northern Ss would be most prone 
to strive for tolerance in their stories, while 
Southern students (Light’s group?) might be 
more apt to play upon the social and eco- 
nomic inferiority of Negroes. 

4. Racial differences as determiners of pro- 
jection have been overestimated, while socio- 
economic factors have been underplayed. 
Middle-class Negroes may have more in com- 
mon with middle-class whites with regard to 
perceived needs, than have middle-class Ne- 
groes with lower-class Negroes. 

5. Animal pictures have been shown to be 
somewhat less effective than human pictures 
in eliciting meaningful projective responses 
from children. 


Conclusions Derived From the TAT 


1. Increasing the “psychological” ambigu- 
ity of a picture does not result in an increased 
response to the picture. Rather, the relation- 
ship is curvilinear with an increased response 
productivity being followed by a decrease as 
the gestalt of the picture begins to disappear. 

2. The meaningfulness of stories for per- 
sonality evaluation seems to follow a similar 
curvilinear relationship with ambiguity, with 
the medium-ambiguous pictures facilitating 
the expression of the most revealing stories. 

3. The “pleasantness” of stories seems to 
increase as ambiguity increases. This occur- 
rence may be a function of the fact that most 
highly structured cards on the TAT are nega- 
tively structured. Another factor may be that 
as the stimulus “pull” becomes weaker, the 
personality needs of the S, as well as the 
background factors such as the personality 
of the examiner, the reason the test is being 
given, and the locale of the test, become in- 
creasingly important. It seems reasonable to 
assume that most persons for cultural reasons 
prefer to see “good” rather than “bad.” Also 
of importance for college students who form 
the hard core of Ss used in these experiments 
is the need of impressing and favorably in- 
fluencing the examiner who may well be per- 
ceived as a teacher-surrogate. Under these 
conditions, the exceedingly ambiguous pic- 
ture may act as a cue to set the S on his 
guard to put his best perceptual foot for- 
ward. 

4. The effect of varying the time of ex- 
posure is minimal provided the S may per- 
ceive the cards for at least a few seconds. 
Presenting the card for only a fraction of a 
second seriously affects the productivity of re- 
sponse. Allowing the picture to be exposed 
for two minutes has little if any advantage 
over showing it for five seconds. 

5. The area of sex differences with regard 
to stimulus ambiguity is largely unknown. 
More information might be obtained if both 
sexes were given both the “male” and “fe- 
male” series and comparisons made taking 
into consideration the sex of the figure de- 
picted on the cards. 

The knowledge that medium-ambiguous 
pictures produced the richest material per- 
sonality-wise in the studies reviewed repre- 
sents only a beginning in the understanding 
of the relationship between stimulus proper- 
ties of the TAT and the responses educed. 
Medium-ambiguity may stem from the task 
of perceiving what is in the picture, or it may 
refer to the interpretation of what is clearly 
perceived. An inspection of the medium-am- 
biguity TAT cards leads to the observation 
that the objects in the pictures seem fairly 
clear; what is vague is the uncertainty as to 
the feelings the characters in the pictures 
seem to be experiencing. Another important 
feature of the medium-ambiguity category is 
the presence of humans on each of these 
cards. Thus, it would appear that the most 
powerful cards for eliciting idiosyncratic pro- 
jective material are cards which contain hu- 
mans who are fairly easily identified but 
whose expressions and feelings are capable of 
multiple interpretation. Thus, in Card 13 MF 
it is obvious to most persons that an appar- 
ently nude woman is lying beneath the covers 
of a bed, while a young man stands fully 
clothed with his arm covering his face. The 
ambiguity resides only in what the characters 
are doing and feeling. Maximum use ought to 
be made, therefore, of cards clearly depicting 
humans, whose behavior, however, is not 
readily discernible. Such cards would appear 
to have more appeal for Ss than fantastic 
“fairy tale” type pictures, at least with regard 
to normal populations. 

In reviewing the earlier studies, stimulus con- 
trol and background factors have been shown 
to play crucial roles in 
the elicitation of responses to TAT-type tech- 
niques. It would appear that the emphasis 
upon the TAT’s ability to tap the “private 
world” of the individual has hindered cli- 
nicians from fully utilizing other equally 
meaningful determinants of perception. In 
fact, one of the key repeating factors seem- 
ing to run through the studies covered has 
been the fact that this “private world” has 
been revealed only under rather specific con- 
ditions and with specific cards. One must con- 
clude, therefore, that much of the thematic 
material educed is obtained with the permis- 
sion of the S, and is subject to his control. 
This statement is in accordance with Lindzey 
and Tejessy’s finding (1956) that TAT hos- 
tility indices correlated more highly with the 
Ss’ self-concept than with observer or clinical 
ratings. Moreover, Allport (1953) has said, 
“. . . normal subjects . . . tell you by the 
direct method precisely what they tell you 
by the projective method. They are all of a 
piece. You may therefore take their motiva- 
tion statements at their face value, for even 
if you probe, you will not find anything sub- 
stantially different” (p. 110). 

If the Ss’ control of the testing situation is 
so pervasive, how can we, as clinicians, de- 
sign tests so as to circumvent this control and 
obtain meaningful glimpses into that “private 
world” which is not offered for public con- 
sumption? The earlier work seems to indi- 
cate that medium-ambiguous pictures should 
be used where the figures are humans, cul- 
turally acceptable as objects of identification 
and yet not so similar to the S as to cause 
him to become overly self-conscious and 
heavily censor his responses. 

Finally, the task of quantifying the stimu- 
lus, background, and organismic variables is 
a formidable one. The stimulus value of per- 
sonality variables such as aggression and sex, 
has been scaled for TAT-type pictures, using 
the Guttman scaling technique (Auld, Eron, 
and Laffal, 1955). Examiner attitudes and be- 
havior could also be measured through an 
amalgam of scaling procedures and observa- 
tion and personality tests, as could the be- 
havior of the S apart from the testing situa- 
tion. The quantification of these variables 
would lend meaning to the responses obtained 
via the projective techniques. Currently, for 
example, one is often at a loss to determine 
whether responses on the TAT reflecting hos- 
tility are part of the S’s perceptual world or 
merely reflect a respect for the objective 
properties of the cards. By means of the 
Stimulus × Background × Organismic ap- 
proach, however, such questions may be an- 
swered as well as ones regarding the relation- 
ship between responses and overt behavior. 
Such an approach may well lead to a more 
homogeneous thematic theme test whose 
stimulus values are known for various per- 
sonality dimensions with regard to different 
populations, thus mitigating the manifold 
problems of interpretation. 


The concept of mental illness as a disease 
process has resulted in the focusing of psy- 
chiatric attention on the symptoms which a 
“mentally ill” patient manifests. The extent 
to which a mentally ill person improves is 
traditionally based on the extent to which the 
original symptom picture becomes modified. 
As with most disease entities, the patient is 
judged as having recovered when the symp- 
toms of the disease abate. 

Descriptive psychiatry, as practiced in 
most mental hospitals today, is based on the 
diagnostic classifications outlined by Kraepe- 
lin. His observation that the praecox pro- 
gressed to a demented end-state led him to 
believe that this process was based on some 
degenerative “disease” process. Later evidence 
discredits the idea of the necessary demented 
end product, and, to a lesser extent, that this 
disease occurs only in young adults. The origi- 
nal evidence, then, for regarding dementia 
praecox as a disease process is no longer con- 
sistent with observations following those of 
Kraepelin. The hypothesis may still be true, 
but the evidence upon which the hypothesis 
was based is no longer accepted. 

The agreement of the American Psychiatric 
Association on a classification system in 1934 
was a major step forward. Professional per- 
sonnel had at last come to agree on a com- 
mon system which facilitated communication 
and research. It is unfortunate, however, that 
many psychiatrists still regard the classifica- 
tions as synonymous with disease entities, al- 
though the majority of the psychiatric group 
regards the classifications as reaction patterns 
rather than disease entities. This is the cur- 
rent status of the Veterans Administration’s 
nomenclature, adopted in 1952. 


1 In cooperation with Psychiatric and Nursing 
Services, particularly Anna Kontas, who supervised 
the collection of all admission behavioral data. This 
study was supported by the Psychiatric Evaluation 
Project, Richard L. Jenkins, Director. 

2 Now at VA Hospital, Ft. Meade, South Dakota. 


The importance of the behavior patterns in 
the schizophrenic reactions was pointed out 
by Adolf Meyer. He and his students, notably 
Norman Cameron, emphasized the influence 
of social factors in the development, onset, 
and form of these reactions. Indeed, Cameron 
(1947) prefers to include the schizophrenics 
under a broad classification of Behavior Dis- 
orders. The implications which arise from the 
use of “disease,” “symptoms,” “mental ill- 
ness” are thus avoided by Cameron. 

The assessment of behavioral adjustment 
offers another approach in the evaluation of 
psychiatric patients. Many psychiatric pa- 
tients who exhibit a minimum of psychiatric 
symptomatology are unable to leave a men- 
tal hospital and lead productive lives. Other 
patients, despite extensive psychopathology, 
leave mental hospitals and become produc- 
tive citizens. The extent of psychopathology, 
therefore, is not necessarily related to the 
adequacy of social functioning, although there 
is generally a high incidence of behavioral 
maladjustment in severely “ill” psychiatric 
patients. 

This paper, then, presents some data on 
the relationship between behavioral adjust- 
ment and psychopathology as related to each 
other and to the level of posthospital adjust- 
ment. Perhaps the most stringent criterion of 
any measure of adjustment in psychiatric 
patients is the relationship between such a 
measure and posthospital adjustment. Such 
an analysis is also attempted in this paper, 
relating level of behavioral adjustment and 
psychopathology at time of discharge to the 
level of social adjustment three months fol- 
lowing discharge. 


Procedure 


Between August 1956 and May 1957, every 
functional psychotic patient under 60 years 
of age, without major physical illness, having 
been hospitalized no longer than 90 days dur- 
ing the preceding 6 months, admitted to the 
Fort Douglas VA Hospital, became a subject 
in this study. The patient was interviewed 
upon admission and rated (RBE) on a psy- 
chopathology scale. A psychiatric nurse and 
aide rated him independently on a behavioral 
adjustment scale following three days’ obser- 
vation. All personnel using the behavioral ad- 
justment scale were trained raters. 

For purposes of this study, part of this 
admission group (N = 78) was broken into 
two additional subgroups for further study: 
1. Those who were discharged from the hos- 
pital within 6 months (N = 27). The pa- 
tients in this group were interviewed and 
again rated on the psychopathology scale at 
the time of discharge, and were also rated in- 
dependently by a psychiatric nurse and aide 
on the behavioral adjustment scale at this 
time. Twenty-five of the 27 patients in this 
group were personally interviewed 3 months 
following discharge (WHC) and rated on a 
Posthospital Adjustment Scale (described be- 
low). These 25 included all patients who had 
been contacted and interviewed at the time of 
this analysis. 2. Those who remained in the 
hospital 6 months after their admission (N = 
11). All rating scales were also completed on 
this group. The remaining 40 patients who 
were in neither of the subgroups were: (a) 
those whose exit ratings could not be com- 
pleted (no trained aide or nurse rater) and 
elopements; (b) those who were neither dis- 
charged nor had resided in the hospital for a 
period of 6 months at the time of this analysis. 


Psychopathology Scale 


A modified Lorr Multi-dimensional Rating 
Scale (1953) was used in this study. The Lorr 
scale was modified to meet the needs of the 
Psychiatric Evaluation study and subjected to 
extensive reliability analysis. Several items 
were dropped if they failed to reach an ac- 
ceptable degree of interrater reliability (13 
raters independently rating 52 patients). The 
items that remained included manifest and 
reported depression, manifest and reported 
anxiety, apathy, reported hallucinations, de- 
lusions or morbid suspicion, thought disor- 
der, rapport, disorientation, bizarre posturing, 
memory deficit, lack of motivation, excessive 
hostility, withdrawal, evasion, and coopera- 
tiveness. These items represent, then, a sam- 
pling of Lorr’s psychopathology factors. 

Seven psychiatrists were asked to judge: 
“at what level (rating) each symptom would 
have to be manifest in an interview before 
you would be confident of the presence of 
underlying psychopathology.” The level at 
which at least 6 out of 7 agreed was accepted 
as the psychopathology level. All 7 agreed 
completely on 74% of the items. Each level 
beyond this judgment was scored 1 point to- 
ward a Total Psychopathology score. Seven- 
teen items (described above) were judged 
as indicating psychopathology. On these 17 
items, a patient could score 31 pathology 
points, since most items contained 2 levels 
(ratings) beyond the judged point of psy- 
chopathology. A Total Psychopathology score 
was thus obtained; the higher the score, the 
more extensive the psychopathology. Along 
with the author one additional rater inde- 
pendently rated 17 patients with a resultant 
rho of .84 between the Total Psychopathology 
scores for the 2 raters. The reliability of the 
Total Psychopathology score was thus found 
to be within acceptable limits. 
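
The scoring rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the item names, thresholds, and ratings are hypothetical, and the function assumes that each rating level above the judged psychopathology level adds one point, which is how the text reads but is not spelled out item by item in the report.

```python
def total_psychopathology(ratings, thresholds):
    """Return a Total Psychopathology score for one patient.

    ratings    -- dict mapping item name to the observed rating level
    thresholds -- dict mapping item name to the judged psychopathology level
    Assumption: every rating level above the judged level adds one point.
    """
    total = 0
    for item, threshold in thresholds.items():
        rating = ratings.get(item, threshold)  # unrated items add nothing
        total += max(0, rating - threshold)
    return total


# Hypothetical thresholds and ratings for three of the 17 items.
thresholds = {"delusions": 1, "apathy": 2, "thought_disorder": 1}
ratings = {"delusions": 3, "apathy": 2, "thought_disorder": 2}
score = total_psychopathology(ratings, thresholds)
print(score)
```

With 17 items and, for most items, 2 levels beyond the judged point, a rule of this kind yields a ceiling close to the 31 points reported.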

An intercorrelational analysis of all psycho- 
pathology items (phi correlation, N of 78 rat- 
ings) revealed that each item was “relatively” 
independent of every other item in the scale. 
Although several items were significantly interrelated, 
the highest phi was .51, which accounted 
for only about 26% of the variance shared 
by the 2 items. The relative independence 
of each of the items permitted the scoring of 
a Total Psychopathology score, with each 
item contributing to the total score. 
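
As a rough illustration of the argument above, phi for a pair of dichotomized items can be computed from a 2 x 2 frequency table, and its square gives the proportion of shared variance. The counts below are hypothetical and are not the study's data; only the reported maximum phi of .51 comes from the text.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table [[a, b], [c, d]] of joint item frequencies."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# Hypothetical counts: both items present, first only, second only, neither.
phi = phi_coefficient(20, 5, 5, 20)
shared_variance = phi ** 2

# Shared variance implied by the highest phi reported in the text.
reported_shared = 0.51 ** 2
print(round(phi, 2), round(shared_variance, 2), round(reported_shared, 2))
```

Squaring the reported phi of .51 gives roughly a quarter of the variance, which is the basis for treating the items as relatively independent.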


Behavioral Adjustment Scale 


The MACC Behavioral Adjustment Scale 
(1957) was used in this study. The develop- 
ment, reliability, and validity outlined in the 
manual are not reported here. The scale measures 
4 relatively independent behavioral clusters: 
Motility, Affect, Cooperation, and Communication. 
The latter 3 areas constitute the 
Total Adjustment score. (See Ellsworth, 
1957.) 

The ratings of 2 independent raters (nurse 
and aide) were averaged, thus decreasing the 
interrater error. The interrater reliability of 
the 2 admission raters (Total Adjustment 
scores, 78 paired ratings) was .89. 


Posthospital Adjustment Scale 


As part of the Psychiatric Evaluation project,³ 
25 psychiatric patients were followed 
into the community and rated in 6 areas 
of social adjustment. These included occupa- 
tional (or school) adjustment, management of 
funds, family adjustment, “other” interper- 
sonal relationships, recreational adjustment, 
and community adjustment. One of the au- 
thors (Clayton) developed a 5-point rating 
scale for each of these posthospital adjust- 
ment areas. For example, a patient who is 
employed full time receives a rating of 1 in 
the area of occupational adjustment, while a 
patient who refuses to seek work and does 
not help at home is rated 5. A low total score 
(for the 6 areas) thus indicates “good” posthospital 
adjustment, while a high score indicates 
“poor” posthospital adjustment. 
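
A minimal sketch of the total score follows, assuming it is the simple sum of the six 1-to-5 area ratings (the text speaks of a total "for the 6 areas" but does not state the arithmetic). The area keys and the example ratings are hypothetical labels chosen for illustration.

```python
AREAS = (
    "occupational",
    "funds",
    "family",
    "interpersonal",
    "recreational",
    "community",
)


def posthospital_total(ratings):
    """Sum the six 1-5 area ratings; low totals mean good adjustment."""
    missing = [area for area in AREAS if area not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    for area in AREAS:
        if not 1 <= ratings[area] <= 5:
            raise ValueError(f"{area} rating must be between 1 and 5")
    return sum(ratings[area] for area in AREAS)


# Hypothetical patient: employed full time (1), moderate ratings elsewhere.
example = {
    "occupational": 1,
    "funds": 2,
    "family": 2,
    "interpersonal": 3,
    "recreational": 2,
    "community": 1,
}
total = posthospital_total(example)
print(total)
```

Under this assumption totals run from 6 (best adjustment in every area) to 30 (poorest).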

A social worker (Clayton) visited the pa- 
tient and his family 3 months following dis- 
charge and rated each patient in the 6 post- 
hospital adjustment areas. He was not aware 
of either the patient’s behavioral adjustment 
or pathology ratings at the time of the inter- 
view and rating. Following the interview, he 
wrote a narrative report describing each pa- 
tient’s social situation and conditions under 
which he was living, including the behavioral 
patterns and relationships in each of the 6 
areas of adjustment. Two additional social 
workers independently rated all 25 patients 
from this narrative report, with a resulting 
average intercorrelation of .82 between all 
raters on the total scores. Thus, the Posthos- 
pital Adjustment Scale appears to result in 
satisfactorily reliable scores which permit the 
scoring of each patient’s over-all posthospital 
adjustment. 

3 This VA project, to evaluate the effectiveness of 
treatment of several VA hospitals, is directed by 
Richard L. Jenkins and Lee Gurel. 


Results 


a. Table 1 presents the intercorrelations 
between Total Psychopathology scores and 
Behavioral Adjustment areas upon admission. 
The results indicate that the higher the be- 
havioral adjustment, the lower the extent of 
psychopathology. In other words, the more 
pleasant, cooperative, and communicative the 
patient was, the less apt he was to show extensive 
psychopathology. The relatively high 
intercorrelations between two different kinds 
of scales, rated independently, are somewhat 
surprising. This would seem to suggest that 
these two assessment approaches of the ad- 
justment of the newly admitted patient with 
a psychotic reaction, whether it be from a 
behavioral or psychopathological standpoint, 
have much in common. 

b. The amount of improvement (score 
change) on both behavioral adjustment and 
psychopathology was analyzed. The score 
changes of Subgroup 1 (those who were dis- 
charged within 6 months of admission) were 
compared on both scales. This resulted in a 
product-moment correlation of .74 (signifi- 
cant beyond .01 level of confidence). Thus, 
the discharged patient who improves in his 
behavioral adjustment tends to show a simi- 
lar amount of improvement in his psycho- 
pathology symptoms. Both methods of as- 
sessing improvement have much in common, 
as would be expected from the results of 
Table 1. 

c. The results of Subgroup 2 (those who 
were not seen as ready for discharge at the 
end of 6 months of hospitalization) are sum- 
marized in Table 2. The mean scale changes 
of this “in-hospital group” were compared 
with the mean scale changes of those actually 
discharged before reaching the 6-month point 
of hospitalization (Subgroup 1). 

It appears that, with regard to using discharge 
as a criterion of “improvement,” the 
psychopathology scores do not distinguish 
between those improved sufficiently to war- 
rant discharge within 6 months and those 
who were seen as needing hospitalization beyond 
6 months. The behavioral adjustment 
scale scores, however, reflected a significant 
change in the “improved” (discharged) group, 
but not in the “unimproved” group. Insofar 
as this particular VA hospital operates, behavioral 
adjustment appears to be more highly 
related to improvement (as measured by discharge) 
than is psychopathology. 


Robert B. Ellsworth and William H. Clayton 


Table 1 

Intercorrelations Between MACC Behavioral Adjustment Scores and Total 
Psychopathology Score Upon Admission 
(N = 78) 

                        Motility   Affect   Cooperation   Total Adjustment   Communication 
Total psychopathology    +.49*     -.51*      -.70*            -.65*             -.69* 

* Significant beyond .01 level of confidence. Motility scores are not included in the Total Adjustment scores. 

d. A fourth analysis involved the relationship 
of the length of hospital stay to 
behavioral adjustment level and total psychopathology 
score at the time of admission. In 
other words, which is more predictive of the 
length of hospital stay (which is an indirect 
measure of the severity of the psychiatric dis- 
order), the extent of psychopathology or the 
level of behavioral adjustment at the time of 
admission? With 35 patients who had been 
discharged from the hospital at the time of 
this analysis (27 discharged before, 8 dis- 
charged after 6 months of hospital treat- 
ment), a product-moment correlation of .21 
(not significant) between extent of psycho- 
pathology and length of hospital stay, and a 
correlation of — .41 (significant at .01 level) 
between behavioral adjustment level and 
length of hospital stay resulted. It appears 
that the patient who has the highest behav- 
ioral adjustment upon admission tends to re- 
main in the hospital the shortest time. The 
extent of pathology symptoms on admission 
does not appear to be significantly related to 
length of hospital stay. 
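
The report does not say which significance test was applied to these correlations. One conventional choice, shown here purely as an assumption, is to convert a product-moment r into a t statistic with N - 2 degrees of freedom. The sketch below applies it to the two values reported for the 35 discharged patients; the function name is hypothetical.

```python
import math


def t_for_correlation(r, n):
    """t statistic, with n - 2 degrees of freedom, for testing r against zero."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Reported correlations with length of hospital stay (N = 35).
t_pathology = t_for_correlation(0.21, 35)
t_behavior = t_for_correlation(-0.41, 35)
print(round(t_pathology, 2), round(t_behavior, 2))
```

The resulting t values would then be compared with tabled critical values for 33 degrees of freedom at whatever confidence level and directionality the analyst adopts.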

e. The last analysis involved the relation- 
ship between extent of psychopathology and 
level of behavioral adjustment at time of 
discharge, to posthospital adjustment three 
months following discharge. Thus, it was pos- 
sible to relate these two measures of the ad- 
justment of psychiatric patients to the ade- 
quacy of posthospital community functioning. 
The effectiveness of hospital treatment is per- 
haps best measured against the adequacy with 
which an ex-patient handles his life following 
discharge. 

In this analysis, the product-moment cor- 
relation between extent of psychopathology 
upon discharge and level of posthospital ad- 
justment was .22 (not significant, N = 25). 
The level of behavioral adjustment at time 
of discharge, in relation to level of posthos- 
pital adjustment, was — .47 (significant be- 
yond the .02 level of confidence). Thus, the 
behavioral adjustment, as rated independently 
by a psychiatric aide and nurse at time of 
discharge, was significantly related to level 
of community adjustment 3 months following 
discharge: the patient who adjusted 
well in the community (low score) tended to 
have the highest behavioral rating at the time 
of discharge. 


Table 2 

Mean Behavioral Adjustment and Psychopathology Changes of Patients Not Seen as Ready for Discharge 
Compared with Those Discharged Within Six Months of Hospitalization 

                                       Behavioral Mean     Pathology Mean 
Groups                            N      1(a)     2(b)          1(a) 
Discharged within 6 months       27      40.3     46.7           6.2 
Not discharged within 6 months   11      39.3     43.0           6.3 

* Significant beyond .01 level of confidence. 
(a) Refers to the mean of behavioral and pathology scores at time of hospital admission. 
(b) Refers to the mean of behavioral and pathology scores at time of discharge from the hospital. 


Discussion 


Although behavioral adjustment and psy- 
chopathology might, at first glance, appear to 
be two entirely different approaches in evalu- 
ating psychiatric patients, they are both based 
on behavioral observations. Symptomatology 
might be described as abstractions from be- 
havior. For instance, two “delusional” pa- 
tients may differ widely in their behavior in 
relation to a delusion. Behavioral adjustment 
is the more concrete description of behavior, 
while psychopathology is the more abstract 
description. This explains in part their similarity 
as demonstrated in Results a and b. 

A major difference between behavioral 
adjustment and psychopathology, however, 
which may partly explain the superiority of 
behavioral adjustment in relation to Results 
c, d, and e, is that psychopathology assessment 
has no “top.” Psychopathology scales 
range from the presence of symptoms to their 
absence. Psychopathology does not go beyond 
the absence of symptoms; i.e., two patients 
who show no delusions, thought disorganization, 
etc., may be very different people. 
One may be friendly and outgoing while the 
other may be reserved and withdrawn. Be- 
havioral adjustment, on the other hand, ex- 
tends the assessment into measuring both the 
degree of positive as well as negative be- 
haviors. 

The analyses of the relationships of psy- 
chopathology and behavioral adjustment to 
such “external” criteria as the length of hospitalization 
(indirectly a measure of the severity 
of psychiatric condition; Result d), readiness for 
discharge (Result c), and level of social adjustment 
three months after discharge (Result e) 
revealed that an assessment of traditional 
psychopathology symptoms was not 
significantly related to any of these external 
criteria. Yet, in most psychiatric hospitals, 
the “improved” patient is the patient whose 
symptoms have cleared even though such im- 
provement may not be related to how well 
the patient handles his life after he leaves 
the hospital. The present study is limited by 
the inclusion of only functional psychotic patients 
and by the choice of only those psychopathology 
symptoms which were found to be 
acceptably reliable in an earlier study. Never- 
theless, the present study would seem to raise 
some serious doubts as to the usefulness of 
psychopathology as a meaningful criterion of 
improvement in relation to other external cri- 
teria of adjustment. The assessment of be- 
havioral adjustment as a measure of improve- 
ment would seem to offer a more promising 
approach. As was seen in Result c, patients 
often show a significant decrease in psycho- 
pathology without developing a sufficiently 
high level of behavioral adjustment to war- 
rant discharge. The relative absence of psy- 
chopathology, then, does not necessarily mean 
that the patient is an adequate social indi- 
vidual. On the other hand, many individuals 
with rather extensive psychopathology are 
able to handle themselves quite adequately in 
everyday social situations. 

It is of more than historical interest to 
note that the era of “moral” treatment in 
the early 19th century resulted in over 75% 
of first admission psychiatric patients released 
as cured or improved, and over 50% of these 
patients never relapsing up to 36 to 60 years 
following release (Bockoven, 1956). These re- 
sults, which would be looked upon with pride 
by most psychiatric hospitals today, occurred 
before the advent of well defined systems 
of psychopathology, and certainly before the 
psychotic process was looked upon as a dis- 
ease entity. Bockoven (1956) concludes that 
these earlier recovery rates were not a result 
of a given treatment but a result of the non- 
obstruction of the natural course of mental 
illness. Recovery occurs when patients are 
treated as human beings, and, “The moral 
therapist acted toward his patients as if they 
were mentally well . . . in impressing on pa- 
tients the idea that a change to more accept- 
able behavior was expected” (Bockoven, 1956, 
pp. 302-303). Clearly an emphasis was placed 
on behavior; and consequently, as Bockoven 
suggests, this early approach to treatment of 
mental illness must seem naive and super- 
ficial to modern dynamic psychiatry. The 
efforts of many 20th century psychiatrists, 
however, to establish mental illness as a dis- 
ease entity and to bring this field closer to 
medicine probably resulted, until recently, in 
an increased psychological distance from the 
patient with consequent neglect of him as a 
human being. 


Conclusions 


1. Measures of behavioral adjustment and 
psychopathology are significantly related, 
warranting the use of both as measures of 
improvement in mental illness. 

2. Behavioral adjustment appears to be 
more sensitive to “patients ready for dis- 
charge,” in predicting length of hospital stay 
and in predicting the level of posthospital 
adjustment, than psychopathology. The lack 
of any significant relationship between psychopathology 
assessment and these “external” 
criteria would seem to raise some serious 
doubts as to the usefulness of psychopathol- 
ogy as a socially meaningful criterion of im- 
provement in functionally psychotic patients. 
The Self-Activity Inventory (SAI) developed 
by Philip Worchel (1957) for the U. S. 
Air Force was designed as a psychiatric 
screening device as well as to aid in under- 
standing self-concept variables. The SAI was 
originally devised to test the susceptibility to 
failure of pilot candidates under conditions 
of military stress. Subsequently, some SAI 
variables have been shown to be related to 
decrements in performance under stress as 
applied in the laboratory (Cawthon, Old- 
royd, & Young, 1957; Miller & Worchel, 
1957). 

Moreover, the SAI has been utilized in sev- 
eral investigations with both males and fe- 
males (Cawthon et al., 1957; Miller & 
Worchel, 1957; Wilkinson, 1956; Young, 
1956), but previous validation studies were 
conducted with males only. Because some of 
the above studies, and others by Witkin 
(1954), revealed interactions between sex and 
other variables as well as actual sex differ- 
ences, the necessity for additional validation 
which would take into account the sex vari- 
able was indicated. 

The present study was undertaken to ex- 
amine relationships between self-concept vari- 
ables as measured by the SAI and criterion 
variables consisting of group-rankings and 
self-rankings of personal adjustment, and to 
investigate possible sex differences among 
these variables. 


Method 


Population. The population consisted of 50 
female and 44 male students. Each subject 
had resided in his or her fraternity or sorority 
house for at least five months. It was assumed 
that each was well acquainted with all other 
students in the group due to the intimate contact 
provided by group activities and general 
proximity. The participating college students 
were no longer living in their family groups 
but were required to accept new forms of 
discipline and authority relationships, to form 
new positive identifications in a context dif- 
ferent from that of the family, and to forego 
narcissistic need for self-gratification in favor 
of group standards and goals. These are cri- 
teria of personal adjustment mentioned by 
Worchel (1957) in the rationale of the SAI. 

Procedure. Identical procedures were fol- 
lowed in collecting data for males and fe- 
males. A like-sex experimenter worked with 
each group. All subjects were given the SAI 
before being asked to perform the ranking 
procedure. 

Description of the SAI. The SAI is a self- 
rating scale consisting of 54 statements de- 
scribing responses of hostility, achievement, 
sex, and dependency in particular situations. 
The scale was reproduced in two identical 
forms for males and females, except that per- 
sonal pronouns appeared in the appropriate 
gender on each form. The subject makes three 
ratings on a five-point scale for each item, ac- 
cording to the following scheme which ap- 
pears as three columns on the test blank: “I 
am a person who .. . ,” “I would like to be 
a person who . . . ,” “The average person is 
one who. . . .” Summing the ratings in each 
of the three columns yields a Self score (S), 
an Ideal Self score (I), and a score for the 
“Average Person” (A). The items are so 
worded that high scores on each of these vari- 
ables indicate a trend toward depreciation on 
the variable in question; that is, the higher a 
person’s Self score the lower is his self-esteem, 
and similarly, the higher his Ideal-Self score 
the greater is the tendency to depreciate his 
ideal self, and so on. In addition to these 
three scores, the absolute discrepancies be- 
tween ratings can be summed across the 54 
items yielding three discrepancy scores: the 
Self-Ideal (S-I), Ideal-Average (I-A), and 
Self-Average (S-A) discrepancy scores. Abso- 
lute discrepancy scores are used because alge- 
braic discrepancy scores derived from this in- 
strument have mathematical properties which 
severely limit their usefulness (Worchel, 
1956). 
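
The six SAI variables described above can be sketched as follows. The function name and the three-item example are hypothetical (the actual inventory has 54 items rated on a five-point scale), and the sketch assumes the discrepancy scores are the item-by-item absolute differences summed over items, as the description indicates.

```python
def sai_scores(self_ratings, ideal_ratings, average_ratings):
    """Return the three column sums and the three absolute discrepancy sums."""
    if not (len(self_ratings) == len(ideal_ratings) == len(average_ratings)):
        raise ValueError("all three columns must rate the same items")
    rows = list(zip(self_ratings, ideal_ratings, average_ratings))
    return {
        "S": sum(self_ratings),
        "I": sum(ideal_ratings),
        "A": sum(average_ratings),
        # Absolute differences are taken per item before summing.
        "S-I": sum(abs(s - i) for s, i, _ in rows),
        "I-A": sum(abs(i - a) for _, i, a in rows),
        "S-A": sum(abs(s - a) for s, _, a in rows),
    }


# Hypothetical ratings for 3 items only.
scores = sai_scores([4, 2, 5], [1, 2, 3], [3, 3, 3])
print(scores)
```

Taking absolute values item by item means that opposite-signed differences cannot cancel, which is why an absolute discrepancy score can be large even when the column sums are similar.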

Description of the ranking procedure. Fol- 
lowing completion of the SAI, each subject 
was given a list of members in his organiza- 
tion and asked to rank each individual in the 
group, including himself, on “personal adjust- 
ment.” The technique was similar to that 
used by Holtzman (1950a; 1950b), and 
Calvin and Holtzman (1953). 

Personal adjustment was defined for the 
subjects as follows: 


An ideal personal adjustment by which you might 
judge would be the person who is free from emotional 
problems, who handles personal problems well 
and who rarely shows signs of excessive worry or 
anxiety. The person functions with a minimal amount 
of conflict and tension in relation to the rest of the 
group and has harmonious interpersonal relationships. 
This person seems to be basically happy and 
is liked by others. 


Table 1 

Correlations Among Self Rank, Group Rank, 
and SAI Scores 

[Table body not recoverable. Column groups: Girls (N = 50), Boys. Rows: SAI Self, Ideal, Average, S-I.] 

* Significant at the .05 level of confidence. 
** Significant at the .01 level of confidence. 


This description of personal adjustment spe- 
cifically mentions factors related to both per- 
sonal adjustment and to interpersonal or 
group adjustment. 

The group-ranking score for each individual 
on personal adjustment was the mean of indi- 
vidual ranks assigned to a person by his as- 
sociates. The group-ranking for a member 
thus represented the consensus of his associ- 
ates. The self-ranking represented an indi- 
vidual’s opinion of himself on the adjust- 
ment variable. Thus, two independent esti- 
mates of personal adjustment were obtained. 
A high numerical rank in both instances is 
depreciating, while a low numerical rank is 
enhancing. In addition, the absolute differ- 
ence between group rank (GR) and self rank 
(SR) provided self-group rank (SR-GR) dis- 
crepancy scores. The use of the absolute dif- 
ference score (SR-GR) was justified because 
in this study it is assumed that personal ad- 
justment is in part a function of divergences 
between self and group evaluations of the in- 
dividual. Thus the magnitude of discrepancy, 
rather than direction, is the important con- 
sideration. 
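
The three ranking scores can be sketched as below. The data structure, the function name, and the three-member example are hypothetical; the group rank is taken as the mean of the ranks assigned by a member's associates, excluding the self rank, as the description states.

```python
from statistics import mean


def adjustment_ranks(rankings, member):
    """Return (self rank, group rank, absolute SR-GR discrepancy) for one member.

    rankings -- dict mapping each rater to a dict of {ranked member: rank}
    """
    self_rank = rankings[member][member]
    ranks_from_associates = [
        given[member] for rater, given in rankings.items() if rater != member
    ]
    group_rank = mean(ranks_from_associates)
    return self_rank, group_rank, abs(self_rank - group_rank)


# Hypothetical three-member house; a rank of 1 is the most enhancing.
rankings = {
    "ann": {"ann": 1, "bea": 2, "cat": 3},
    "bea": {"ann": 2, "bea": 1, "cat": 3},
    "cat": {"ann": 3, "bea": 2, "cat": 1},
}
sr, gr, discrepancy = adjustment_ranks(rankings, "ann")
print(sr, gr, discrepancy)
```

Because the discrepancy is absolute, a member who ranks herself well above the group's view and one who ranks herself well below it receive the same SR-GR score.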


Results and Discussion 


Correlations of SAI variables with self and 
group ranks are presented in Table 1. 

Examination of Table 1 indicates that 
group rank did not correlate significantly with 
any SAI variables. Self rank (SR) and the 
discrepancy between self and group rank 
(SR-GR) were significantly correlated only 
with the S and S-I scores from the SAI. Such 
significant relationships were present more 
frequently for men. However, Fisher’s z 
transformation applied to pairs of correla- 
tions for males and females revealed that 
none of the scores was significantly different. 
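
A comparison of this kind can be sketched with the usual large-sample test for two independent correlations. This is an illustration under assumptions, not a record of the authors' computation: the function name is hypothetical, and the example uses the male and female SR-GR by S-I correlations reported later in this section together with the group sizes given under Method (44 males, 50 females).

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Normal deviate for the difference between two independent correlations."""
    z1 = math.atanh(r1)  # Fisher's z transformation of r1
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / standard_error


# Males: r = -.51, N = 44; females: r = -.39, N = 50.
z = fisher_z_difference(-0.51, 44, -0.39, 50)
print(round(z, 2))
```

A deviate this far inside the conventional two-tailed cutoff of 1.96 agrees with the statement that the male and female correlations did not differ significantly.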

Thus attempts to validate the SAI as a 
predictor of adjustment were uniformly un- 
successful under the conditions of this study. 
The most meaningful and reliable criterion 
of adjustment (GR) showed no significant 
relationship with the SAI scores, nor any pronounced 
trends in this direction. This finding 
points to the limitations of the SAI when 
used for this purpose. 

Since SR and S are two independent esti- 
mates of presumably similar aspects of a per- 
son’s self-concept, a positive correlation be- 
tween these two variables was expected. This 
anticipated relationship was obtained with a 
significance level of .01 for males. The .27 
value of r for females very closely approxi- 
mates the .28 required for a significant corre- 
lation at the .05 level of confidence with 48 
df. It was not surprising that S-I was also significantly 
correlated with SR, since S and S-I 
were not derived independently. This relation- 
ship holds at the .01 level for males and .05 
level for females. 

SR and GR correlated .22 and .21 for 
males and females, respectively. There ap- 
pears to be little relationship between how 
the individual views himself and the group 
consensus of him on personal adjustment. It 
is also interesting to note that the groups of 
both sexes ranked the large majority of indi- 
viduals lower (less well-adjusted) than these 
persons ranked themselves. In other words, 
individuals and groups in this college popula- 
tion did not agree on the degree of adjust- 
ment of the individuals, and a prime source 
of this disagreement arose from the fact that 
most subjects saw themselves as better ad- 
justed than did the group. 

Probably the most striking findings of this 
study, however, center around the relation- 
ships between the SR-GR scores and the S 
and S-I variables. The value — .39 was ob- 
tained between the S and SR-GR scores for 
males. This negative relationship suggests 
that if the individual, on the SAI, tends to 
express freely what one may term socially 
disapproved feelings and behavior, his judg- 
ment of his personal adjustment tends to be 
similar to the judgment of the group. Con- 
versely, if he denies such feelings and be- 
havior, his judgment tends to disagree with 
the group consensus. It may be that this de- 
nial is one factor that the individuals in the 
group use in making global evaluations of 
adjustment. However, this inference is tem- 
pered by the fact that the corresponding cor- 
relation for females was not significantly dif- 
ferent from zero. 


Significantly negative correlations between 
SR-GR and S-I of —.51 and —.39 were ob- 
tained for males and females, respectively. 
Thus a small self-ideal discrepancy is asso- 
ciated with a large divergence between self 
and group judgments of adjustment and vice 
versa. At first glance one might conclude that 
this result casts some doubt on the conten- 
tions of many “self” theorists (Lecky, 1951; 
Rogers, 1951; Snygg & Combs, 1949) that 
self-ideal congruence is positively associated 
with adjustment. It should be remembered 
that whereas self-ideal discrepancy is a meas- 
ure of discrepancy between two aspects of the 
individual’s self-perception, SR-GR is a meas- 
ure of discrepancy between the individual’s 
self-perception and the perception of the in- 
dividual by others. A relatively large or small 
self-ideal discrepancy was found to be asso- 
ciated with disagreement, regardless of direc- 
tion, between the group and individual esti- 
mates of adjustment. Thus a person with a 
small self-ideal discrepancy tends to rank 
himself either considerably above or below 
the group rank for himself. On the other 
hand, the individual who is more openly dis- 
satisfied with himself (large self-ideal dis- 
crepancy) tends to give himself a ranking on 
adjustment appreciably more in accord with 
that of the group. 


Summary and Conclusions 


In attempting to assess the validity of 
the Self-Activity Inventory with a mixed sex 
population, all resident members of a sorority 
and a fraternity were administered the SAI 
and asked to rank themselves, along with all 
members of their respective groups, on “per- 
sonal adjustment.” Scores were obtained for 
each subject on six variables from the SAI 
and were correlated separately for each sex 
with self rank and with pooled group rank. 
Self scores and Self-ideal discrepancy scores 
were correlated with a discrepancy score con- 
sisting of the absolute difference between self 
and pooled group ranking. 

The results indicated that SAI scores were 
not related to the major criterion of adjust- 
ment used in this study, namely, pooled group 
rank. There were no significant differences in 
the magnitudes of the correlations for the 
sexes between SAI scores and either pooled 
group rankings or the self-rank group-rank 
discrepancy scores. Under the conditions of 
this study the validity of the SAI proved to 
be markedly limited. 

The most provocative finding showed that 
a relatively large or small self-ideal discrep- 
ancy is significantly associated with disagree- 
ment, regardless of direction, between the 
group and individual estimates of adjustment. 
The results also suggested that the individual 
who tends to give free expression to self- 
depreciative attitudes on the SAI is more 
likely to agree with the group judgment of 
his personal adjustment than one who tends 
to deny such attitudes. In general, there was 
little relationship between self and group 
ranks of the individual’s adjustment. How- 
ever, inspection revealed that most people 
ranked themselves above the group evalua- 
tion, i.e., more well-adjusted. 


Received January 9, 1958. 
University of Michigan 


One of the most penetrating criticisms of 
previous attempts at assessing clinical judg- 
ments has been that the experimental predic- 
tions asked of the clinician differed in class 
or in kind from those made in his day-to-day 
practice. More specifically, the clinician often 
felt that the prediction of success at flight 
school (Holtzman & Sells, 1954), competency 
in clinical psychology (Kelly & Fiske, 
1951), or even outcome of psychotherapy 
(Barron, 1953), called for a range of infer- 
ences beyond those usually demanded of him. 
Some clinicians deny making predictions and, 
instead, characterize their usual work in terms 
of “personality description,” “diagnosis,” “dynamic 
formulations,” or “understanding the 
patient.” Seen in this light, a good many of 
the published studies of the clinician’s effec- 
tiveness can be dismissed as irrelevant by 
practicing clinicians who see themselves pri- 
marily as diagnosticians or therapists. 

To eliminate these objections and thereby 
place clinicians in their best light, clinical assessment 
must: (a) involve problems typically 
encountered by the practicing clinician 
(for example, diagnostic judgments of neuropsychiatric 
referrals), (b) allow the clinician 
to use his favorite techniques (be they tests, 
interviews, case history material, reports from 
other services, etc.) in his favorite manner of 
utilizing them, and then (c) independently 
validate his conclusions against evidence acceptable 
to science as a whole; moreover, this 
entire procedure should be compared with 
similar judgments made by nonpsychologists 
(for example, clerical help). 


1 The author wishes to express his gratitude to the 
personnel of the Ann Arbor VA Hospital for their 
help in making this study possible. Special thanks 
are due Philip A. Smith of the hospital and E. 
Lowell Kelly and Max Hutt of the University of 
Michigan for their encouragement and criticisms of 
this paper. 

2 Now at Stanford University. 

Once one has established the kinds of judg- 
ments which clinicians tend to make more 
validly than less-trained personnel, then the 
assessment process can be analyzed segmen- 
tally to see where and how this increased ac- 
curacy comes about. This would involve the 
same considerations mentioned above, with 
one important difference: now the clinician 
would be restricted to one specific instrument 
or one assessment technique. At this stage 
one finds many studies aimed at validating a 
specific technique, yet very rarely do they 
attempt to approximate the criteria already 
considered. For this reason, the question of 
what each diagnostic technique contributes to 
over-all diagnostic competence lies still unanswered.³ 

The present paper does not concern itself 
with the value implications of what the cli- 
nician does. Only in the sense that his work 
is compared with less-trained personnel is any 
judgment made of the worth of his labors. 
Whether diagnostic work-ups utilize the cli- 
nician’s time to best advantage, whether he 
should be able to predict overt behavior, 
whether he should concentrate on research 
and/or therapy: all such questions are 
omitted here. The assumption has been made 
that clinicians spend a good deal of their 
time in diagnostic testing, and the question 
asked here is, “how good a job are they 
doing?” 


3 These comments circumvent two related problems: 
(a) the “clinical-actuarial” controversy and 
(b) the “molar-molecular” controversy. Both issues 
lend themselves to parallel research within the framework 
mentioned above. For example, diagnoses arrived 
at actuarially could be compared with those of 
the clinician and the nonpsychologist. And diagnoses 
made using all the clinician’s techniques could 
be compared with ones made using a portion of 
these techniques or just one “favorite” technique. 

The experimental answer to this last ques- 
tion hinges on two ancillary ones: who is to 
be called a clinician, and what is the nature 
of the independent evidence to test his judg- 
ments. The frequent criticism heard when 
groups of clinicians “fail” in some experi- 
mental study is that they were not “expert” 
enough or that they were essentially “acade- 
micians.” Moreover, the criteria typically uti- 
lized in studies of this kind are all too often 
dubious in nature. Even the broad classifications
“psychotic,” “neurotic,” “character disorder,”
and “normal”—much less the more
homogeneous nosological categories such as
“paranoid schizophrenia,” “obsessive-compulsive
neurosis,” etc.—have no commonly accepted
operational definitions. For this reason,
it is typical to accept as a criterion
either the consensus of many (the majority 
of whom could be in error) or the judgment 
of an established few (typically psychiatrists, 
who are usually responsible for the diagnostic 
referral to the clinical psychologist in the first 
place). 

In this paper, which reports the first of a 
proposed series of assessments of clinical prac- 
tice, the major concern is with diagnosis 
rather than prediction to later behavior. The 
nosological category chosen for this study was 
“organic brain damage, cortical,” because of 
its inherent criterion for ultimate operational 
definition; namely, the independent diagnosis 
of a competent neurological team.⁴ The clinicians
assessed included those who were currently
practicing as diagnosticians (either
staff or trainees) at a large VA hospital, and
their diagnostic performance was compared
with that of nonpsychologists (hospital secretaries).
This study was aimed at the second
stage of a total assessment project, an appraisal
of the validity of a specific diagnostic
instrument in the hands of practicing clinicians.
Since the Bender Visual-Motor Gestalt
Test is the most widely used test for organic
brain damage at the installation under
consideration (and perhaps at many other
installations), this instrument was chosen as
the technique in question.

4 This statement is not meant to imply that present
neurological techniques are as yet able to assess
the many gradations along the continuum from complete
absence of cortex to perfect cortical health, at
least short of opening up the skull and examining
the brain microscopically. Nor is there any implication
that localization, chronicity, and amount of
cortical destruction will not influence performance
on psychological tests. However, if a neurological
team can separate cases on both extremes of the
continuum, then these cases might serve as logical
criteria for the evaluation of psychological techniques.


Procedure 


Protocols of Bender-Gestalt tests were ran- 
domly selected from the files of a VA general 
medical and surgical hospital. Of these proto- 
cols, those from the first 15 patients who had 
been diagnosed by independent neurological 
examination as showing clear-cut evidence of 
cortical impairment were selected to repre- 
sent patients manifesting organic brain dam- 
age (hereafter termed organics). As the non- 
organic control group (hereafter termed non- 
organics) the protocols of the first 15 patients 
from psychiatric wards were selected where 
(a) psychiatric diagnoses were clearly agreed 
upon, (b) no symptoms usually associated
with organic brain damage were reported, 
(c) no record of cerebral trauma was noted, 
and (d) any routine examination by the 
neurological staff was negative to cortical im- 
pairment. These latter psychiatric patients 
were all fairly recent admissions to the hos- 
pital at the time they were tested and in gen- 
eral could be characterized as displaying acute 
rather than chronic symptomatology. Table 1 
summarizes certain descriptive variables for 
these 30 patients. 

The 30 Bender protocols were divided into 
three groups of 10 each, such that in Group I 
there were 2 organics and 8 nonorganics, in 
Group II there were 5 of each, and in Group 
III there were 8 organics and 2 nonorganics. 
This was done so as to be able to investigate 
the relationship between frequency of occur- 
rence of a diagnostic entity (base-rate) and 
accuracy of its diagnosis; also, this allowed 
the work for each clinician to be broken down 
into more convenient sections. All protocols 
were assigned a code number which was
printed on cardboard and stapled over the
patient’s name, thus removing this identifying
data. On the other hand, all descriptions of
how the patient actually drew the Bender designs
(i.e., arrows, circles, short notes, etc.)
were left on the protocols.


Table 1

Description of Patient Populations

Organics (N = 15; Mean age = 38; Age range = 24-61)

  Etiology (where available): Trauma; Tumor; Multiple sclerosis;
    Thrombosis; Alzheimer’s disease
  Localization (where available): Rt. cerebral hemisphere; Lt. temporal
    lobe; Rt. parietal lobe; Rt. front parietal lobe; Rt. carotid artery
  Symptomatology (where available): Convulsions; Death; Chronic brain
    syndrome; Hemiparesis; Hemiplegia, spastic; Headaches

Nonorganics (N = 15; Mean age = 32; Age range = 23-54)

  Final Psychiatric Diagnoses: Paranoid schizophrenia; Catatonic
    schizophrenia; Manic-depressive psychosis; Character disorder;
    Conversion reaction; Anxiety neurosis; Obsessive-compulsive neurosis


Only the actual reproductions of the Bender
designs were used in this study, since uniform
elaborations and free associations were not
available for all the patients.

Table 2 gives a breakdown of some variables
concerning the judges who participated
in this project. All of the psychologists were
actively engaged in diagnostic evaluations at
the time the study was conducted, although
they varied both in their general diagnostic
testing experience, as well as in their particular
experience with the Bender. Many
used the test almost routinely as part of their
psychological examinations of referred patients,
while a few used it only rarely. Without
exception, the nonprofessional judges had
had no contact with the technique.

Each judge was given the three packets of
10 protocols, one packet at a time in random
order. Directions were essentially as follows
for all participants:

You will be given 30 Bender protocols for your
diagnostic impressions. For your convenience they
have been divided into 3 groups of 10 each. Please
judge each Bender individually, using any system
you normally apply to such a task. Take all the
time you want, and feel free to utilize any instruments
(i.e., compass, protractor, ruler, etc.) which
you feel will increase your diagnostic accuracy.
Please do the very best job you possibly can.⁵ Record
your judgments on the face-sheet attached to
each packet.

This face-sheet was a mimeographed form, 
listing the patients’ code numbers in a column 
down the left-hand side of the sheet. Along 
the top of the sheet were column headings 
Organic and Nonorganic, as well as a confidence
rating scale labeled Positively; Fairly
certain; Think so; Maybe; and Blind guess.
Instructions on the form were to check the
most appropriate diagnostic description for
each patient and also to indicate one’s degree
of confidence in each of these judgments.
Thus, each participant made one dichotomous
diagnostic judgment and then qualified it on
a five-point scale of confidence.

5 To increase the judges’ involvement in this diagnostic
task, a bottle of Scotch was offered to the
judge who performed most accurately. It was generally
felt that all of the judges “tried their best.”
Although the judges spent only 15 to 30 minutes
diagnosing all 30 patients, in general they expressed
their impressions that they had been as careful in
these diagnoses as they would be typically in their
regular professional evaluations.


Table 2

Description of Participating Judges

                                                                              Approximate Experience
                                     Mean         Level of Training           with the Bender
Group                  N             Age          in Psychology               Mean          Range

Psychology staff       [illegible]   [illegible]  Ph.D. plus 4-10 yrs         6.0 yrs.      4-9 yrs.
                                                  experience
Psychology trainees    [illegible]   [illegible]  M.A. plus 1-4 yrs           [illegible]   [illegible]
                                                  experience
Nonpsychologists       [illegible]   [illegible]  None                        None


Table 3

Diagnostic Accuracy by Groups of Judges

                                                                 Number of Judges Differing
                                     Mean %                      Significantly from Chance
Group                  N             Correct       Range         (50%) at the .05 Level

Psychology staff       [illegible]   65%*          60-70%        1 (25%)
Psychology trainees    [illegible]   70%           60-77%        6 (60%)
Nonpsychologists       [illegible]   [illegible]   [illegible]   5 (62.5%)
All groups             [illegible]   [illegible]   [illegible]   12 (54.5%)

* All mean differences between groups are nonsignificant.

For the nonprofessional participants, the 
directions were amplified slightly. Since they 
had no basis at all for their judgments, they 
were administered the Bender as a patient 
would have received it. Then, they were given 
the following directions: 


Now you have an idea of the test itself. Psycholo- 
gists use this test to help them diagnose patients 
with brain damage, on the assumption that brain- 
damaged persons draw the designs differently than 
persons without such damage. If you wish, you may 
use your drawings as a guide to the way a non- 
brain-damaged person responds on the test 


Then followed the same directions given the 
psychologists. 

After the 22 judges had completed the diagnostic
task, the Bender reproductions were
scored by the Pascal-Suttell Objective Index
(Pascal & Suttell, 1951), for comparison purposes
with the clinical judgments.⁶

6 The author wishes to thank Ronald Ribler of
Michigan State University for his help in scoring the
Bender protocols.


Results 
Interjudge Analysis 


The most striking finding of this study is 
the complete overlap between groups of 
judges on diagnostic accuracy (see Table 3).
Staff psychologists, psychology trainees, and 
nonprofessional persons did not differ from 
each other in their ability to differentiate or- 
ganic from nonorganic patients by means of 
their Bender protocols. The judges’ degree 
of successful diagnoses ranged from 57% to 
77%, with six judges differing at the .01 level 
and six more at the .05 level from statistical 
chance (50%). The remaining ten judges did 
not differ in their diagnostic accuracy from 
that attributable to chance alone. 
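
The comparison with chance can be illustrated with an exact binomial calculation. The article does not name the test applied to individual judges, so the one-tailed exact binomial test below is an assumption, and the function names are illustrative only. The sketch shows how many of the 30 dichotomous judgments must be correct before a judge exceeds the 50% chance level.

```python
from math import comb

def upper_tail(k: int, n: int = 30, p: float = 0.5) -> float:
    """Exact probability of k or more correct out of n under chance rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def min_correct_for_alpha(alpha: float, n: int = 30) -> int:
    """Smallest number correct whose one-tailed chance probability is below alpha."""
    for k in range(n + 1):
        if upper_tail(k, n) < alpha:
            return k
    raise ValueError("no count reaches this alpha")

print(min_correct_for_alpha(0.05))  # 20 of 30, i.e. 67%
print(min_correct_for_alpha(0.01))  # 22 of 30, i.e. 73%
```

Under these assumptions a judge at 60% (18 of 30) would not differ from chance, while one at 77% (23 of 30) would differ at the .01 level.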

While the Pascal-Suttell Objective Index 
(1951) was developed to help differentiate 
psychotic from neurotic populations, it has 
been used for the diagnosis of organics (Bow- 
land & Deabler, 1956). Cutting scores for 
this use of the index vary, and extensive 
normative data is as yet unavailable. If a Z 
score of 100 is used to separate the groups, 
the index accurately diagnoses 63% of the 
patients in this study (Fisher’s Exact: p = 
.04). When the cutting score is lowered to 
90, the percentage of successful diagnoses in- 
creases to 67% (p = .01). The optimum cut- 
ting score for this population seems to be 
around 80, at which point the index diag- 
noses with 80% accuracy (p < .005). As the 
cutting point is lowered below this, accuracy 
slowly falls away (77%, 73%, and 70% for 
Z cutting points of 70, 60, and 50, respec- 
tively); at a cutting score of 40 or below, 
only chance results occur. 
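
The cutting-score analysis can be sketched as follows. Several details are assumptions: that scores at or above the cutting point are called organic, that the Fisher test was one-tailed, and the function names. The article reports only percentages and p values, not the underlying 2 x 2 tables, so the table used in the example is hypothetical and merely consistent with 80% accuracy.

```python
from math import comb

def accuracy_at_cutoff(scores, is_organic, cutoff):
    """Fraction classified correctly when score >= cutoff is called organic."""
    hits = sum((s >= cutoff) == truth for s, truth in zip(scores, is_organic))
    return hits / len(scores)

def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher exact p for the 2x2 table [[a, b], [c, d]].

    Rows are true organic / true nonorganic; columns are called organic /
    called nonorganic. Sums the hypergeometric probability of every table
    with a or more organics correctly called, holding the margins fixed.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    top = min(row1, col1)
    return sum(
        comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)
        for x in range(a, top + 1)
    )

# Hypothetical split: 12 of 15 organics and 12 of 15 nonorganics correct (80%).
p_value = fisher_one_tailed(12, 3, 3, 12)
print(p_value < 0.005)  # True
```

Sweeping `accuracy_at_cutoff` over candidate cutting points on a new sample is how an optimum such as the value near 80 reported here would be located; an optimum found this way is specific to the sample it was tuned on.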

To check on group differences in interjudge 
agreement, the percentage of agreement for 
each group of judges was computed for each 
patient. The average percentage of agreement 
over all 30 patients, for each group of judges, 
ranged from 78% to 85% (chance would pre- 
dict 50%), with no statistically significant 
intergroup differences. 
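
The article does not define how the percentage of agreement was computed. One reading, assumed here, is the share of judges in a group who side with that group's majority call on each patient, averaged over patients. The function name and the small data set are hypothetical.

```python
def mean_agreement(calls):
    """calls[j][p] is True when judge j called patient p organic.

    Agreement on a patient is taken as the share of judges siding with the
    majority call, so it ranges from 0.5 (even split) to 1.0 (unanimous).
    """
    n_judges = len(calls)
    n_patients = len(calls[0])
    per_patient = []
    for p in range(n_patients):
        organic_votes = sum(calls[j][p] for j in range(n_judges))
        per_patient.append(max(organic_votes, n_judges - organic_votes) / n_judges)
    return sum(per_patient) / n_patients

# Four hypothetical judges, three hypothetical patients.
calls = [
    [True, True, False],
    [True, False, False],
    [True, True, True],
    [True, False, False],
]
print(mean_agreement(calls))  # 0.75
```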

Although the groups appeared quite similar 
with respect to diagnostic accuracy and in- 
terjudge agreement, there were large differ- 
ences between the groups in the amount of 
confidence they placed in their judgments 
(Kruskal-Wallis H test: p < .01). The nonprofessional
judges—with no training or experience
in Bender interpretation (and therefore
no apparent reason for developing confidence
in the technique)—were, as a group,
much more confident in their judgments than 
were either the staff or the trainees. And, the 
trainees displayed more confidence in their 
diagnoses than the staff. The results present 
the surprising paradox of an inverse relation- 
ship between the amount of experience with 
the Bender and the degree of confidence 
placed in diagnoses made from it. Moreover, 
there was no relationship between individual 
diagnostic accuracy and degree of confidence! 
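
For readers unfamiliar with the Kruskal-Wallis H test used for the confidence comparison, the statistic can be computed as in the sketch below. The implementation uses midranks for ties and omits the tie correction, and the three groups of scores are hypothetical; the judges' actual confidence scores are not reported individually.

```python
def kruskal_wallis_h(groups):
    """H statistic for k independent samples, using midranks for ties.

    No tie correction is applied to H itself, so the value is slightly
    conservative when many scores are tied.
    """
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # Midrank for each distinct value: average of the 1-based positions it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    total = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Hypothetical mean confidence scores for three groups of judges.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 2))  # 7.2
```

With three groups, H is referred to chi-square with 2 degrees of freedom (5.99 at the .05 level, 9.21 at the .01 level); for groups as small as those in this illustration the approximation is rough.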

To test whether the reduced confidence ex- 
hibited by the more sophisticated judges was 
the result of their finer discrimination be- 
tween the easier and the more difficult judg- 
ments (with the more difficult ones being 
rated with less confidence), the average de- 
gree of confidence for each judge was com- 
puted for those cases he diagnosed correctly 
and again for those cases when he misdiag- 
nosed the patient. The difference between 
these two averages (hereafter termed the in- 
dex of discrimination), when computed for 
each group of judges, was found to show no 
significant intergroup differences. In general, 
judges were about as confident on cases they 
misdiagnosed as they were on those they diag- 
nosed correctly. When the index of discrimi- 
nation was used as the basis for ranking the 
judges, there was found to be no relationship 
between this measure and either total degree 
of confidence or total diagnostic accuracy. 
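
The index of discrimination described above can be written out directly. The sketch uses the 0 to 4 confidence scoring given in the footnotes to Table 4; the function name and the example ratings are hypothetical.

```python
CONFIDENCE = {"blind guess": 0, "maybe": 1, "think so": 2, "fairly certain": 3, "positive": 4}

def discrimination_index(ratings, correct):
    """Mean confidence on correct diagnoses minus mean confidence on misdiagnoses."""
    right = [CONFIDENCE[r] for r, ok in zip(ratings, correct) if ok]
    wrong = [CONFIDENCE[r] for r, ok in zip(ratings, correct) if not ok]
    if not right or not wrong:
        raise ValueError("need at least one correct and one incorrect diagnosis")
    return sum(right) / len(right) - sum(wrong) / len(wrong)

# Hypothetical judge: two correct and two incorrect diagnoses.
ratings = ["positive", "think so", "maybe", "fairly certain"]
correct = [True, True, False, False]
print(discrimination_index(ratings, correct))  # 1.0
```

A value near zero corresponds to the finding reported here, that judges were about as confident when wrong as when right.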


A further measure was developed as an index
of clinical judgment,⁷ under the assumption
that a misdiagnosis which had been given
with little confidence should not count as 
heavily against the clinician as one which 
was given with great confidence. An index 
was constructed in the following manner: for 
each correct diagnosis qualified by more than 
his median degree of confidence, each judge 
was given two points; for each correct diag- 
nosis given with less than his median degree 
of confidence, one point; for each misdiag- 
nosis with greater than median confidence, 
minus two points; for each misdiagnosis 
qualified by less than median confidence, 
minus one point. When total scores were 
computed for each judge, analysis showed no 
differences between the psychologists, trainees, 
and the nonprofessional judges on this meas- 
ure of clinical judgment. 
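
The scoring rule for the index of clinical judgment can be sketched as below. The article does not say how a rating exactly at the judge's median was scored; treating such ties as "less than median" is an assumption, as are the function name and the example data.

```python
from statistics import median

def clinical_judgment_index(confidence, correct):
    """Confidence-weighted score: +2/+1 for correct, -2/-1 for incorrect diagnoses.

    A diagnosis earns the larger weight only when its confidence is strictly
    above the judge's own median; ties at the median get the smaller weight
    (an assumption, since the article does not say how ties were handled).
    """
    cut = median(confidence)
    score = 0
    for conf, ok in zip(confidence, correct):
        weight = 2 if conf > cut else 1
        score += weight if ok else -weight
    return score

# Hypothetical judge with five diagnoses, confidence scored 0 to 4.
print(clinical_judgment_index([4, 3, 2, 1, 0], [True, False, True, True, False]))  # 1
```

With 30 patients the index can range from -60 to +60 under this rule.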

An analysis was made of the judges’ tend- 
encies to over-call or under-call organicity in 
the population they were diagnosing. The 
trainees as a group made the most “organic” 
diagnoses, 13.5 per judge, and they were the
nearest to judging the actual number of organics
in the sample. The nonprofessionals
made the fewest such diagnoses, only 8.5 per
judge. These differences almost approach statistical
significance (Kruskal-Wallis H: p = .10),
but there was no relationship between
the number of patients called organic and 
diagnostic accuracy. A measure of the discrepancy
between the number of organics
called and the actual number in the sample
was computed for each group; neither this
measure nor the gross number of organics
called was related to any of the previously
reported indices. Table 4 summarizes the
findings on these various measures.

Since the original material presented to the
judges consisted of three packets, each containing
different percentages of organic patients,
the data could be reanalyzed to investigate
base-rate differences on the variables
already considered. In this respect, each
packet can be thought of as a separate study,
each comparable to one conducted at a different
type of installation (for example, a
GMS hospital, an NP hospital, a home for
mentally deficients, etc.). All of the previously
reported indices (accuracy, interjudge
agreement, degree of confidence, discrimination,
clinical judgment, and number of patients
called organic) were computed for each
group of judges for each of the three different
organic base-rates. Table 5 summarizes these
results.

7 Suggested by Philip A. Smith of the Ann Arbor
VA Hospital.


Table 4

Performance of the Three Groups of Judges on Some Selected Indices

Columns: Group; Interjudge Agreement, Average %; Degree of Confidence (a, b);
Index of Discrimination (c); Number called Organic (d); Index of Clinical
Judgment (e).

Rows: Psychology staff; Psychology trainees; Nonpsychologists; Totals.

[Most cell values are illegible in this copy. The legible entries are 2.6
and 1.1, whose row and column cannot be determined, and a single column
reading 16.2 (Psychology staff), 20.6 (Psychology trainees), 18.8
(Nonpsychologists), and 19.1 (Totals), apparently the Index of Clinical
Judgment.]

a Intergroup differences in degree of confidence significant at the .01 level
(Kruskal-Wallis H test); all other differences nonsignificant.
b Scored 0 for “blind guess”; 1 for “maybe”; 2 for “think so”; 3 for “fairly
certain”; and 4 for “positive.”
c Scored by subtracting the confidence score on misdiagnoses from that on
correct diagnoses.
d The actual number of organics was 15.
e For method of scoring see text.


Table 5

Group Performance as a Function of Differing Organic Base-Rates (a)

Columns: Group; Diagnostic Accuracy; Interjudge Agreement; Number called
Organic; Confidence; Discrimination; Clinical Judgment. Only the entries
shown below are legible in this copy.

                               Diagnostic     Interjudge
Group                          Accuracy       Agreement

Group I (2 organic; 8 nonorganic)
  Psychology staff             72%            [illegible]
  Psychology trainees          62%            [illegible]
  Nonpsychologists             78%            [illegible]
  Total                        70%            84%
  [Two further Interjudge Agreement entries, 78% and 84%, are legible
  for Group I, but their rows cannot be determined.]

Group II (5 of each)
  Psychology staff             [illegible]    [illegible]
  Psychology trainees          [illegible]    [illegible]
  Nonpsychologists             68%            [illegible]
  Total                        68%            [illegible]

Group III (8 organic; 2 nonorganic)
  Psychology staff             [illegible]    [illegible]
  Psychology trainees          [illegible]    [illegible]
  Nonpsychologists             56%            [illegible]
  Total                        66%            [illegible]

a For explanation of indices, see Table 4 or text.
b Differences among nonpsychologists at the three base-rates significant at
the .05 level (Friedman two-way analysis of variance).
c Differences significant at the .01 level (Friedman).
d Differences significant at the .05 level (Friedman).
e Differences between groups significant at the .05 level for e[ach
base-rate] (Kruskal-Wallis one-way analysis of variance).

Note.—All other differences nonsignificant.

The Friedman two-way analysis of vari- 
ance (nonparametric) was run to test differ- 
ences attributable to the differing base-rates. 
The only statistically significant differences 
uncovered were on the “diagnostic accuracy” 
and “number called organic” indices. The 
nonprofessionals were significantly more ac- 
curate in their diagnoses as the number of 
organics in the sample decreased (probably 
because of their consistent tendency to be the 
most sparing in their use of the diagnosis 
“organic”). All groups called more patients 
organic as the actual base-rates of organics in 
the sample increased (although in the case 
of the staff judges, the small size of their 
group prevented statistical significance). 

The Kruskal-Wallis one-way analysis of 
variance (nonparametric) was carried out to 
test if any of the differences between mean
group scores, at each base-rate, were statistically
significant. Just as in the case of the
over-all analysis, the three groups of judges 
were found to differ significantly only in their 
degree of confidence. 


Interpatient Analysis 


Sometimes the “majority opinion” of a 
group is more accurate than any of the opin- 
ions of its individual members. To check on 
this, a “group” diagnosis, generated by com- 
bining the diagnoses made by all of the 22 
judges, was examined for each patient. The 
amount of agreement between judges ranged 
from complete agreement on one patient (a 
manic-depressive psychotic, correctly diag- 
nosed by everyone as nonorganic) to a com- 
plete 11-11 split in opinion on two patients 
(both organics). There was less than 75% 
agreement on 8 of the 30 patients. 

For those 28 patients on whom there were 
“majority” diagnoses, the group as a whole 
misdiagnosed two nonorganics (both paranoid 
schizophrenics) and five organics (of whom 
three had symptoms which included grand 
mal convulsions, one was a case of posttrau- 
matic encephalopathy, and one had a deadly 
glioblastoma multiforme). Thus, if one counts 
the evenly divided cases as errors (since no 
group diagnosis was generated), the group as 
a whole correctly diagnosed 70% of the pa- 
tients, a figure not significantly different from 
the mean of the individual diagnoses (68%). 
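
The pooling of judges into a "group" diagnosis can be sketched as a simple majority vote in which even splits count as errors, following the description above. The function names and the four example patients are hypothetical; the last lines reproduce the arithmetic behind the reported 70%.

```python
def group_diagnosis(votes_organic, n_judges=22):
    """Return 'organic', 'nonorganic', or None for an even split."""
    if 2 * votes_organic > n_judges:
        return "organic"
    if 2 * votes_organic < n_judges:
        return "nonorganic"
    return None

def group_accuracy(votes, truths, n_judges=22):
    """Share of patients whose majority call matches the criterion; splits count as errors."""
    hits = sum(group_diagnosis(v, n_judges) == t for v, t in zip(votes, truths))
    return hits / len(votes)

# Hypothetical patients: number of the 22 judges calling each one organic.
votes = [22, 11, 5, 15]
truths = ["nonorganic", "organic", "nonorganic", "organic"]
print(group_accuracy(votes, truths))  # 0.5

# Figures from the text: 30 patients, 2 even splits, 7 majority misdiagnoses.
reported = (30 - 2 - 7) / 30
print(round(reported, 2))  # 0.7
```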

Interestingly, the degree of agreement 
among the judges did not correlate signifi- 
cantly with the accuracy of their combined 
diagnoses. Moreover, majority accuracy did 
not turn out to be related to such variables 
as the chronological age of the patient nor to 
his nosological category. On the other hand, 
since most of the judges tended to underesti- 
mate the actual number of organics in the 
sample, they tended to be most in agreement 
on patients whom they diagnosed nonorganic 
(Fisher’s Exact: p = .013). 

Certain patients must have seemed easier 
to diagnose than others, since there was a 
strong relationship between the amount of 
agreement on a patient’s diagnosis and the 
total pooled confidence ratings given for this
diagnosis (Fisher’s Exact: p < .001). Surprisingly,
however, the judges were just as
confident in rating their incorrect as their
correct diagnoses; nor were they any less confident
in diagnosing organic than nonorganic
patients.


A Subsequent Exploration 


In the course of examining the relationship 
between diagnostic accuracy and experience 
with the Bender, it was noted that the judge 
who performed most accurately was a trainee 
who had spent considerable time (as part of 
his research for a doctoral dissertation) in 
administering, scoring, and interpreting tests 
for organic brain damage with a large group 
of brain-damaged patients. Although a staff 
judge also had considerable Bender experi- 
ence and only performed at the median level 
on this task, one might still wonder whether 
specific intensive experience with the instru- 
ment might not increase diagnostic accuracy. 
In effect, this hypothesis would imply that 
although the practicing diagnosticians were 
not more accurate than nontrained persons on 
this task, real “experts” with the Bender 
could surpass them all. 

To test this hypothesis, one of the coun- 
try’s foremost authorities on the Bender test 
was solicited to take part in this study.⁸ This
judge, taking some 20 hours to complete the
diagnostic process, did perform more accurately
than anyone in the original study—
diagnosing 83% of the patients correctly. His 
scores fell in the middle of the over-all dis- 
tribution on degree of confidence and dis- 
crimination, but he was one of the most ac- 
curate in judging the actual number of or- 
ganics in the sample. He also was at the top 
on the index of clinical judgment. 

Since his performance lends support to the 
“expert” theory on Bender diagnosis, it 
seemed legitimate to combine the scores of 
the top two diagnostic judges in this study 
into a subgroup of Bender “specialists” and
then to reanalyze the data comparing their 
performance on all the indices with that of 
the three other groups. This analysis revealed 
no significant differences on measures other
than those of accuracy and clinical judgment
(Fisher’s Exact: p < .004), on which variables
the “specialists” were, in effect, selected.

8 Special acknowledgment is due Max Hutt of the
University of Michigan for the time and thought he
invested in this phase of the study.


Discussion 


In general, the results indicate that diag- 
nostic accuracy (when using the Bender to 
diagnose organic brain damage) does not de- 
pend on experience or training in psychology 
(unless, perhaps, that training includes years 
of intensive work with the instrument in ques- 
tion). If he is not a real expert in the use of 
the Bender, a clinician will find that his secre- 
tary can probably do this particular job of 
differential diagnosis as well as himself. More- 
over, she will most likely have considerably 
more confidence in her judgments than he 
would have in his. This makes it all the more 
unfortunate that one’s degree of confidence 
bears no relationship to his diagnostic accu- 
racy on this task! 

Now these results, in themselves, may be 
embarrassing. And, when one considers the 
base-rates of organics usually encountered in 
clinical practice, one might even become 
alarmed. For in most settings the actual base- 
rate is closer to Group I (20% organics) 
than to any of the other groups in this study. 
And it is in this Group I where the nonpro- 
fessionals—by virtue of their tendency to 
label most patients nonorganic—appear most 
likely to overshadow their professional em- 
ployers in diagnostic accuracy (see Table 5). 
However, neither the nonprofessionals nor the 
group as a whole did as well as could have 
been done by merely calling all patients in 
this group “nonorganic” and thereby diagnos- 
ing with 80% accuracy. 
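
The base-rate argument can be made concrete by comparing each packet's observed accuracy with the accuracy obtained from giving every patient the more frequent label. The packet compositions come from the Procedure; the observed totals are those that remain legible in Table 5. Extending the comparison from Group I to the other two packets is an illustration, not part of the original discussion, and the function name is hypothetical.

```python
def baseline_accuracy(n_organic, n_total):
    """Accuracy from giving every patient the more frequent label."""
    p = n_organic / n_total
    return max(p, 1 - p)

# (organics, packet size, observed accuracy of all judges combined)
packets = {
    "Group I": (2, 10, 0.70),
    "Group II": (5, 10, 0.68),
    "Group III": (8, 10, 0.66),
}
gain = {name: round(obs - baseline_accuracy(k, n), 2) for name, (k, n, obs) in packets.items()}
print(gain)  # {'Group I': -0.1, 'Group II': 0.18, 'Group III': -0.14}
```

On these figures the judges improve on the blanket label only where organics and nonorganics are equally frequent.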

Before judging the Bender too harshly,
however, the following factors should be considered:
(a) the group as a whole did perform
significantly above chance (50%) on
this task, thus supporting the premise that
groups of organics do respond differently to
the Bender test than do groups of nonorganics
(but when discriminable differences do appear
they are typically so obvious that almost
everyone can detect them); (b) this
study provided no basis for comparing the 
Bender with other tests for organic brain 
damage in order to see whether the wide- 


spread faith in this technique is indeed com- 
paratively justified; (c) it is of course pos- 
sible that other tests taken in combination 
with the Bender may permit judges to diag- 
nose more accurately; and (d) one cannot 
immediately discount the utility of the cues 
furnished by a face-to-face encounter with 
the patient, cues which were totally absent in 
this study. 

On the other hand, these results might have 
been anticipated on the basis of previous work 
in this area (Bowland & Deabler, 1956). 
For example, Pascal and Suttell (1951), 
while formulating their Objective Index to 
differentiate psychotics from neurotics, have 
this to say about the differential diagnosis of 
organics: 


The Bender-Gestalt test cannot, in the absence of
other data, answer that question (is there cortical
damage?) except occasionally in extreme cases which
are also clinically apparent (p. 40).

Performance on the Bender-Gestalt test can indicate
damage to the cortex only when the damage
shows its effect by pronounced disturbance of the
ability to execute the test. We know that nine year
old children can reproduce the designs without
marked deviation from the stimuli. When, therefore,
an individual is functioning at a maturational level
of nine years with respect to his ability to reproduce
the designs, so to speak, we cannot distinguish
between his deviation and those of individuals suffering
from psychogenic disorders. This fact suggests
that damage to the cortex has to be rather severe in
its effect on the functioning efficiency of an adult of
normal I.Q. before it can be detected by means of
performance on the Bender-Gestalt test. This fact
also suggests that actual lesions may exist which
cannot, on the basis of the deviations noted by us,
be detected in performance on this test (pp. 62-66).


Nevertheless, since the evidence suggests 
the possibility that real experts in the tech- 
nique may perform with increased diagnostic 
accuracy on this task, it is conceivable that 
they might be able to communicate whatever 
interpretive refinements they possess. In ef- 
fect, this has been tried with the Pascal-Suttell 
Objective Index, which at its optimum cut- 
ting point performed about as well as did the 
best of the individual judges. Significant in 
this connection, however, is the considerably 
greater length of time taken by the top ex- 
pert in making his diagnoses as compared to 
the amount taken by any of the others, and 
correspondingly the greater length of time
needed to score records by the Pascal-Suttell
method. Assuming that real experts—after 
considerable time and careful scrutiny—can 
slightly out-perform hospital secretaries, the 
value of this potential increment in accuracy 
must be carefully evaluated. 


Summary 


Staff psychologists, psychology trainees, and 
nonprofessional (secretarial) judges made 
diagnostic judgments from the Bender proto- 
cols of 15 organic and 15 nonorganic patients 
and then indicated their degree of confidence 
for each of their diagnoses. The three groups 
of judges did not differ in their ability to 
diagnose organic brain damage from the 
Bender, although the nonprofessionals dis- 
played considerably more confidence in their 
judgments than did either of the other groups. 
The Pascal-Suttell Objective Index approxi- 
mately equaled the clinical judgments in diag- 
nostic accuracy, but a renowned Bender ex- 
pert was able to better the diagnoses of the 


practicing clinicians. The group as a whole 
diagnosed above a chance level, but when the 
base-rate of organic patients typically en- 
countered in clinical practice is considered, 
the results suggest that chances for misdiag- 
nosis could be increased by utilizing the 
Bender-Gestalt test.


CHANGE IN MENTAL ABILITY AS A FUNCTION
OF TEST ANXIETY AND TYPE OF
MENTAL TEST¹

Zweibelson (1956) reported the results of an
investigation of the relation between test anx- 
iety in fourth-grade children and their per- 
formance on test-like and game-like group in- 
telligence tests. The three group tests of men- 
tal ability employed in his study, the Otis 
Alpha, Otis Beta, and Davis-Eells tests, were 
described in detail with regard to items, style
of writing, method of administration, and
the like. A stronger relationship was found between
the Otis Alpha test (a test-like instrument)
and the Test Anxiety Scale for Children
(TASC)² (r = −.28) than between the
Davis-Eells test (a game-like instrument) and
the TASC (r = −.14). While the correlation
between the Beta test and the TASC (−.24)
was not significantly different from the correlation
between the Davis-Eells test and the
TASC (−.14), the Beta-TASC correlation
was similar in magnitude and direction to the
Alpha-TASC correlation. These correlations
suggest that the two Otis tests together are 
different from the Davis-Eells test relative to 
performance on the TASC. Although the cor- 
relation coefficients themselves are not large, 
the correlation differences raise the question 
whether level of test anxiety is a function of 
the type of test being administered. Davis 
and Eells, in their manual (Davis & Eells,
1956, pp. 4-5), contend that performance on
their test is less a function of traits like competitiveness,
compulsiveness, and anxiety than
is performance on other tests which emphasize
reading ability and speed. Zweibelson
found some support for their contention in
his study in which fourth-grade pupils were
administered the two Otis tests, the Davis-Eells
Games, and the TASC once only. The
present paper throws further light on the relationship
between test anxiety and group
means on these three tests of mental ability.
In addition, data are presented which show
the relationship between test anxiety and
changes in Otis Beta and Davis-Eells scores
over an interval of two years.

1 This research was supported by a grant from the
United States Public Health Service.

2 This scale is described in detail (Sarason et al.,
1958). The TASC has been shortened, subsequent to
its 1954 administration, from the original length of
43 items to its present length of 30 items. The form
employed in the analysis presented in this paper is
the original form.

Our specific interests can be stated in the 
form of three hypotheses: First, the difference
between low anxious (LA) and high
anxious (HA) group means should be greater
on the Otis tests than on the Davis-Eells
tests, the HA group obtaining lower means.
Second, the increment in mean raw score
from one year to the next should be greater 
for the LA than for the HA groups on the 
Otis tests. And third, the difference in incre- 
ment between the LA and HA groups should 
be less on the Davis-Eells tests than on the 
Otis tests. Since data for this comparison 
were available for only the Otis Beta test, 
the Otis Alpha scores could not be compared 
with those of the Davis-Eells. 

All three hypotheses stated above follow 
from essentially the same rationale regarding 
test anxiety and test performance. Sarason, 
Davidson, Lighthall, and Waite (1958, p.
109) conceptualize test anxiety as “primarily
interfering in its effect.” Zweibelson (1956) 
qualified this conception. He interpreted his 
data as indicating that the test-anxious reac- 
tion was more interfering in performance on 
a test-like than on a game-like task. In ex- 
ploring available data with regard to the three 
hypotheses stated above we sought further to 
extend the concept of test anxiety. For exam-
ple, we asked, in reference to Hypothesis 2,
“Does the differential performance of high 
and low test-anxious subjects to different 
types of mental test (test- and game-like) 
hold up also for mean change over time on 
these same types of test?” Our answer as 
stated in the second hypothesis was again a 
reflection of our assumption that test anxiety 
is interfering in its effect, but differentially 
interfering with performance on test-like and 
game-like tests. In testing this hypothesis we 
were testing whether the test-anxious reac- 
tion was interfering with improvement in per- 
formance on these two types of mental test. 
The first hypothesis, in contrast, is simpler 
in its assumptions and follows more directly 
from the implications of Zweibelson’s study, 
namely, that over-all mean performance on 
these two types of test should be differential 
as regards the high and low anxious groups, 
the LA group doing significantly better than 
the HA on the test-like test but not so mark- 
edly superior to the HA on the game-like 
task, where anxiety is assumed to be less in- 
terfering. Hypothesis 3, like the second hy- 
pothesis, relates to change in performance on 
the two types of test and indicates our as- 
sumption that this change should be differ- 
ential as regards the HA and LA groups. That 
is, the scores of the HA should increase sig-
nificantly less than those of the LA on the
test-like instrument, whereas the gains of the
LA and HA on the game-like instrument should
not be significantly different from each other.
Each of these hypotheses assumes that the TASC is effective
in discriminating the high from the low anx- 
ious over a period of time up to two and a 
half years from the date of TASC adminis- 
tration. 


Subjects 


Subjects were all in the same grade. Ninety-
two pupils were distributed equally in four
sex-anxiety groups (n = 23) in all but the
first of three analyses. In that first analysis
108 pupils, including 83 of the above-men-
tioned 92, were distributed equally between
the high and low anxiety groups (n = 54),
but unequally between the two sex groups,
boys numbering 62 (31 LA and 31 HA) and
girls numbering 46 (23 LA and 23 HA).
Data were available for these Ss because of
their connection, as a part of a larger sam-
ple, with a continuing research program.
These data are not the result of a planned
execution of a research design but, rather,
result from an investigation after the fact, so
to speak, to establish whether certain effects
found by Zweibelson (1956) are also pres-
ent in these data, resulting from the regular
school testing program.


Table 1

Sequence of Test Administrations

Test                     Date         Grade   Date         Grade   Date         Grade

TASC                     May, 1954
Otis Alpha (N* = 108)    Jan., 1952
Otis Beta (N* = 92)      Oct., 1954   5       [illegible]          Oct., 1956
Davis-Eells (N* = 92)    Oct., 1954                                Oct., 1956

[Remaining date and grade entries are illegible in the source.]

* N here refers to the number of Ss finally selected for study
from the total tested [...]ation. While these Ss cannot be
considered to be a random sample, it is safe to say that
the only factor limiting their number was absence from school
on testing days.


Procedure 


The school-wide testing program had been 
carried out in the sequence shown in Table 1. 
The TASC was administered to these Ss in 
May, 1954, when they were in the fourth 
grade. Unlike Zweibelson’s study, the LA and 
HA groups in the present study were not 
taken from the extremes of the distribution of 
test anxiety scores but were grouped relative 
to their position above or below the sample 
median. Separate medians were computed for 
boys and girls. Analysis of variance designs 
were used throughout, with both between and 
within subject effects. In comparing scores of 
two different tests, standard scores were com-
puted. Analysis of gain scores in the Otis
Beta-Davis-Eells comparison was based on
the standardized differences of each individu-
al’s raw score on the first administration of
a test subtracted from his raw score on that 
test administered two years later. 
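
As a rough illustration of the grouping and gain-score steps just described, the following Python sketch splits pupils into LA and HA groups at the TASC median of their own sex and converts each pupil's raw gain to a standard score. The records, field layout, and scores are invented for illustration. The treatment of scores falling exactly at the median and the use of the sample standard deviation are assumptions, since the text does not specify them.

```python
from statistics import mean, median, stdev

# Hypothetical records:
# (pupil id, sex, TASC score, first raw score, raw score two years later)
pupils = [
    ("p1", "boy", 4, 30, 41),
    ("p2", "boy", 12, 28, 35),
    ("p3", "boy", 9, 33, 47),
    ("p4", "boy", 15, 25, 30),
    ("p5", "girl", 6, 35, 49),
    ("p6", "girl", 17, 31, 39),
    ("p7", "girl", 11, 29, 40),
    ("p8", "girl", 20, 27, 33),
]

def anxiety_groups(records):
    """Label each pupil LA or HA relative to the TASC median of his or her own sex."""
    medians = {
        sex: median(r[2] for r in records if r[1] == sex)
        for sex in {r[1] for r in records}
    }
    # Assumption: a score exactly at the median counts as LA.
    return {r[0]: ("HA" if r[2] > medians[r[1]] else "LA") for r in records}

def standardized_gains(records):
    """Raw gain (later minus earlier score) converted to z scores over the sample."""
    gains = {r[0]: r[4] - r[3] for r in records}
    values = list(gains.values())
    m, s = mean(values), stdev(values)
    return {pid: (g - m) / s for pid, g in gains.items()}

groups = anxiety_groups(pupils)
z_gains = standardized_gains(pupils)
for pid in groups:
    print(pid, groups[pid], round(z_gains[pid], 2))
```

Computing separate medians for boys and girls keeps the LA and HA groups the same size within each sex, which is what allows equal cell sizes in the analysis of variance.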


Results 


The first analysis, involving the TASC and 
the Otis Alpha test, showed no anxiety, sex, 
or anxiety-sex interaction effects either on 
over-all Otis Alpha means of the two ad- 
ministrations combined or on gain in mean 
from the first to the second administration. 

The second analysis, involving the TASC 
and the Otis Beta test and in which there 
was a 77% overlap with Ss employed in the 
first analysis, showed the only clearly signifi- 
cant finding (p < .01) to be a simple main 
effect of sex grouping on the over-all Beta 
mean (collapsing the three sets of scores from 
the three Beta administrations into one set 
of scores). The girls received a higher over-all 
mean than the boys. In addition, this mean 
was different for the two anxiety groups 
between the .10 and .20 level, the HA do- 
ing poorer than the LA groups, as expected. 
There was also a difference (.10 < p < .20) 
in the rate of gain in Beta score over the 
three years for the two anxiety groups, the 
LA gaining the more rapidly and augmenting
the difference between the HA and LA means.
Sex was not found to affect the rate of gain
on Beta score.


Table 2
Summary of Otis Beta and Davis-Eells Raw Score
and Gain Score Analyses
(n = 23, N = 92)

Source            Mean Square      Mean Square
                  Raw Score        Gain Score

Between Ss
  Anxiety          3,213.91**      [illegible]
  Sex                364.57        [illegible]
  A × S              219.58        [illegible]
  Error              450.05        [illegible]

Within Ss
  Test            16,209.39 a      [illegible]
  T × A               66.98        [illegible]
  T × S              147.96        [illegible]
  T × A × S          151.56        [illegible]
  Error              163.43        [illegible]

* .05 < p < .10.
** p < .01.
a Significant but trivial, since standardized scores were not
used and tests contained different numbers of items.

The third analysis, with the same Ss as in 
the second, concerned the differential rela- 
tionship of test anxiety to Davis-Eells scores 
compared with Otis Beta scores. This analy- 
sis revealed a significant effect of anxiety 
level on mean mental ability score (p < .01) 
which was uncomplicated by interaction of 
test (Otis Beta or Davis-Eells) with anxiety. 
As to the change of scores on the two tests 
over two years, there was no over-all effect 
of anxiety on gain but rather an anxiety-test 
interaction effect, as hypothesized (.05 < p 
< .10). This effect, while not robust, re- 
vealed that while the LA gained more over 
two years than HA on the Beta (test-like) 
test, the HA gained more than the LA on the 
Davis-Eells (game-like) test. The difference 
between LA and HA was more marked on the 
Beta test than on the Davis-Eells test, how-
ever. The third analysis, including mean
squares for both over-all mean effects and for 
effects of change, is summarized in Table 2. 

The reader will note that the within Ss and 
between Ss error mean squares for (standard- 
ized) gain scores are of virtually equal magni- 
tude. This indicates that there is as much 
variation, on the average, from an individu- 
al’s gain on the Otis to his gain on the Davis- 
Eells as there is from one individual’s over- 
all gain to another’s over-all gain. This is in 
contrast to the within and between error 
terms, also appearing in Table 2, in the analy- 
sis of raw scores combined over the two years. 
In other words, there is no correlation be- 
tween gains on the Otis and gains on the 
Davis-Eells over this two-year period, even 
though there is a correlation between over-all 
raw scores on the two tests as indicated by 
the small magnitude of the within Ss error 
term for raw scores (163.43) relative to the
comparable error term between Ss (450.05). 
Pooled within groups correlations between 
Otis and Davis-Eells scores from first and 
second administrations of the tests, respec- 
tively, are .46 and .59. 
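
The link drawn above between the two error mean squares and the correlation between tests can be made explicit with a small worked calculation. The sketch below assumes a design with two measures per subject and equal variances on both measures; under that assumption the between-Ss error mean square is proportional to 1 + r and the within-Ss error mean square to 1 - r. The function name is illustrative only, and because the raw scores here were not standardized the result should be read as approximate.

```python
def implied_correlation(ms_between_error, ms_within_error):
    """Correlation between two repeated measures implied by the error mean squares.

    Assumes two measures per subject with equal variances, so that
    MS_between = var * (1 + r) and MS_within = var * (1 - r).
    """
    return (ms_between_error - ms_within_error) / (ms_between_error + ms_within_error)

# Raw score error terms reported in Table 2.
r_raw = implied_correlation(450.05, 163.43)
print(round(r_raw, 3))

# Equal error terms, as reported for the gain scores, imply no correlation.
r_gain = implied_correlation(1.0, 1.0)
print(r_gain)
```

For the raw scores this comes out at roughly .47, in line with the pooled within-groups correlations of .46 and .59 reported above.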


Discussion 
The first hypothesis stands unconfirmed: 
over-all mean performance on the two Otis
tests is not found to be related to test anx-
iety in any way different from that in which
the over-all mean on the Davis-Eells test is 
related to test anxiety. The second hypothe- 
sis receives support from the analysis of Otis 
Beta increments over three years, but not 
from analysis of Otis Alpha increments over 
two (earlier) years. Alpha gains from the 
second to the fourth grade are not signifi- 
cantly related to (fourth grade) test anxiety 
grouping, while Beta gains from the fifth to 
the seventh grade are somewhat (.10 < p < 
.20) related to (fourth grade) test anxiety 
grouping. The third hypothesis is confirmed.
Furthermore, not only is the magnitude of the 
difference between the gain scores of the HA 
and LA groups less on the Davis-Eells test, 
but, in comparison to the LA, the HA show 
greater improvement on the Davis-Eells and 
less improvement on the Otis Beta. 

Various interpretations may be offered for 
the lack of correlation between gain on the 
Otis Beta and gain on the Davis-Eells tests. 
First, it is well known that error in gain 
scores is greater than error of measurement 
in either the first or second measurement 
taken separately. Thus, unreliability might 
appear to be suppressing the correlation be- 
tween the two gain scores. A valid objection 
to this argument would seem to be that the 
gain scores were reliable enough to yield a 
within Ss effect (test-anxiety interaction) sig-
nificant between the .05 and .10 level and
therefore should be reliable enough to allow 
for correlation with each other if such rela- 
tion did in fact exist. Another explanation for 
the lack of relationship between gains on the 
two tests is that the sample employed might 
be peculiar in this regard. While it is true 
that the sample was not drawn randomly from 
an unselected group of pupils, there still re- 
mains the discrepancy between correlations, 
.46 and .59, of raw Beta and raw Davis-Eells
scores, for first and second testings on the one 
hand, and the correlation between gain scores 
(zero), on the other hand. A third explanation 
for the lack of correlation, with severe impli- 
cations for psychological measurement, is that 
improvement on these two tests, the Otis 
Beta and Davis-Eells, requires quite inde- 
pendent types of growth, even though they 
both purport to measure intelligence. If this
is true, it lends strong support to Davis’ and
Eells’ contention that they are tapping a dif- 
ferent domain from those measured by most 
group-administered mental tests, of which the 
Otis is in many respects representative. If 
gains on group-administered mental ability 
tests are important, and if the results pre- 
sented here on gains are at all representative, 
then the choice of which test and/or type of 
test (examination- or game-like) to employ is 
extremely important. Additional research on 
the reliability and correlates of the gain scores 
on the Otis Beta and Davis-Eells tests is 
necessary before unequivocal interpretation of 
longitudinal results with these tests is pos- 
sible. The assumption that the TASC is ef- 
fective in discriminating the high from the 
low test-anxious over a period of time up to 
two and one-half years appears to have valid 
foundation, provided that this period is be- 
tween the fourth and seventh grades. 

While the first hypothesis was not borne 
out and the second received only partial sup-
port, it must be emphasized that the basis of
grouping Ss in anxiety levels was their po-
sition relative to the median and not mem-
bership in extreme groups, as in the Zweibel-
son study. This fact tends to point up the
significance of the support for the third hy- 
pothesis. However, as this was an exploratory 
study, its main significance lies in revealing 
general trends and providing suggestions for 
further research. Two general trends are 
shown: (a) mental ability scores tend to 
change from fifth to seventh grade differen- 
tially for high and low test anxiety groups 
and (b) this change is related to the type of
test used in measuring mental ability. Fur- 
ther research suggested by these results con- 
cerns (a) the interaction of grade with the 
effect of anxiety on change in mental ability 
scores (e.g., why is there no effect of anxiety 
on change from the second to the fourth 
grade when there is one from the fifth to the 
seventh grade?) and (b) the specific nature
of the effect of different types of mental abil- 
ity test on the relation between anxiety and 
performance on these tests. For example, does 
the difference in HA and LA performance on 
the Otis Beta from that on the Davis-Eells 
test show that where no reading ability is 
needed (Davis-Eells) anxiety is not inhibitory
and even possibly facilitating in effect on per-
formance? The above-mentioned trends have
at least two implications. First, the earlier the 
anxiety level of school children can be identi- 
fied, the earlier they can be provided with a 
therapeutic program designed to help them 
recognize the interaction between their anx- 
iety and various types of situations, and to 
learn to capitalize on the (possibly) facilitat- 
ing effects of anxiety while minimizing its in- 
terfering effects. Second, the interpretation of 
change in mental ability must take into ac- 
count variations in type of measuring in- 
strument such as Zweibelson has enumerated 
(1956, Table 1). 


Summary 


Three analyses were carried out on data 
available from a regular school testing pro- 
gram. Each analysis concerned the relation of 
performance on the Test Anxiety Scale for 
Children to performance on three mental abil-
ity tests described and employed by Zweibel-
son (1956). The effect of test anxiety group-
ing (above and below median) on both over-
all mean and change in mean over a dura-
tion of two years was tested with reference
to three hypotheses: 1. The difference be- 
tween low anxious (LA) and high anxious 
(HA) over-all means should be greater on 
the Otis (test-like) tests than on the Davis- 
Eells (game-like) test. 2. The mean incre- 
ment from one year to the next should be 
greater for the LA than the HA on the Otis 
tests. 3. The difference in increment for LA 
and HA should be less on the Davis-Eells test 
than on the Otis Beta test. The first hypothe- 
sis was not supported, while the second re- 
ceived partial support. The third hypothesis 
was supported. Implications both for practice 
and further research were discussed.


THE VALIDITY OF THE SPIRAL AFTEREFFECT
AS A CLINICAL TOOL FOR DIAGNOSIS
OF ORGANIC BRAIN PATHOLOGY ¹


EMILY B. PHILBRICK 


University of California, Berkeley 


Many of the cases that are referred to the 
psychology staff in a general hospital are 
ones in which the neurologists wish additional 
information in their attempt to determine 
whether or not organic brain pathology exists. 
Although psychological tests such as the 
Bender-Gestalt, WAIS, and Rorschach are 
useful in such a diagnosis, these tests are 
far from adequate (Armitage, 1946; Yates, 
1954). More clinical tools are vitally needed 
and so any procedure needs to be thoroughly 
investigated that reportedly will detect or- 
ganic brain pathology. 

Price and Deabler (1955) and Farrett, 
Price, and Deabler (1957) reported that 
96% to 98% of cases with organic brain 
syndrome could be picked up by the Archi- 
medes Spiral test, while Gallese (1956), with 
the same apparatus and a somewhat altered 
procedure and sample, reported that 66% 
of such cases could be correctly diagnosed. 
These reports appeared to be most encour- 
aging and a cross validation study of the 
work of Price and Deabler was undertaken. 
In addition, the spiral test was run after giv- 
ing most patients the Weinstein sodium amy- 
tal test (Weinstein & Malitz, 1954). This 


1 This research was carried out at the Veterans 
Administration Hospital, San Francisco, Calif., upon 
the suggestion of Jerome Fisher. The author wishes 
also to thank Stuart MacRobbie, who administered 
the sodium amytal injections and did the neurologic 
classification, and Henry W. Neuman and Lewis A.
Roberts, who also aided in the neurologic classifica- 
tion. David A. Rodgers helped by reading critically 
the first draft of this manuscript. The manuscript 
was prepared while the author held a national fel-
lowship awarded by the American Association of
University Women.

The author is now at College
Calif.

Weinstein test presumably increases certain
organic symptoms when diffuse brain pathol- 
ogy exists. Therefore, it seemed important to 
study its effect on the perception of the spiral 
aftereffect. 


Method 


Eighty-one patients, essentially consecutive 
admissions, from the neurology ward of this 
general hospital were given the Archimedes 
spiral aftereffect test. Only those physically 
unable to leave their beds, or with gross eye 
defects, were omitted from the testing. The 
apparatus and procedure used corresponded 
to that reported by Price and Deabler (1955). 

An electric motor commonly used for color 
mixing experiments was used to rotate a white 
8” disk on which was painted a black Archi-
medes spiral of 920° or 2½ circuits about the
center. The apparatus was so constructed that 
the disk could be rotated in either direction. 
The disk was rotated at approximately 100 
rpm. 

The patient was seated 8 ft. from the ap- 
paratus. Testing was conducted at noon when 
natural illumination was good. The patient 
was informed that this was a special eye test. 
He was asked to keep his eyes on the center 
of the spiral and was reminded of this at in- 
tervals during the testing period. In half of 
the cases the spiral was rotated first to give a 
negative aftereffect of expansion (Spiral A), 
while in the other half the rotation was in 
the opposite direction first, giving a negative 
aftereffect of contraction (Spiral B). After 10 
seconds of rotation the patient was asked: 
“What does the line seem to be doing?” At 
the end of 30 seconds the rotation was stopped 
and the patient was asked immediately: “Now
what does the line seem to be doing?” The
patient’s answer was recorded verbatim. Each 
patient was shown the spiral four times, in 
the order A-B-A-B or B-A-B-A. 

The responses were scored in two different 
ways: once following the method outlined by 
Price and Deabler (1955) and Farrett et al. 
(1957), who used half scores as well as whole 
scores, and once following Gallese (1956), 
who used only whole scores. Price and Deab- 
ler (1955) state: 


Each normal report, that is, seeing the negative 
aftereffect correctly, was scored 1. Each abnormal 
report, that is, failure to perceive the negative after- 
effect, was scored 0. Reports of aftereffect in which 
no apparent change of dimension occurred but in 
which apparent motion forward or backward was 
reported, were given a score of ½. . . . The Ss scored
a total of 4, 3, 2, 1, or 0 depending on the degree 
of their perception of the aftereffect. . . . In the in- 
terests of conservatism, to give S the benefit of a 
doubtful response, fractional or half scores were 
raised to whole scores in final computation (p. 300). 


Gallese (1956) does not distinguish between 
reports of change of dimension and apparent 
motion forwards or backwards and gives a 
full score of 1 to either type of response. 
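
The two scoring conventions can be compared with a short Python sketch. It is only an illustration under stated assumptions: the response codes ("dimension", "motion", "none") and the example record are invented, the raising of fractional totals follows the passage quoted above, and the cutting point of 0-2 for organic and 3-4 for nonorganic is the one this paper attributes to Gallese.

```python
import math

# Hypothetical response codes for one presentation of the spiral:
#   "dimension" = negative aftereffect reported as expansion or contraction
#   "motion"    = apparent motion forward or backward, no change of dimension
#   "none"      = negative aftereffect not perceived
PRICE_DEABLER = {"dimension": 1.0, "motion": 0.5, "none": 0.0}
GALLESE = {"dimension": 1.0, "motion": 1.0, "none": 0.0}

def total_score(responses, weights, raise_half_scores):
    """Sum the trial scores; optionally raise a fractional total to the next whole score."""
    total = sum(weights[r] for r in responses)
    return math.ceil(total) if raise_half_scores else int(total)

def classify(total):
    """Cutting point used in this paper: 0-2 organic, 3-4 nonorganic."""
    return "organic" if total <= 2 else "nonorganic"

trials = ["dimension", "motion", "motion", "none"]  # invented example record
pd_total = total_score(trials, PRICE_DEABLER, raise_half_scores=True)
g_total = total_score(trials, GALLESE, raise_half_scores=False)
print("Price and Deabler:", pd_total, classify(pd_total))
print("Gallese:", g_total, classify(g_total))
```

With this invented record the same four responses fall on opposite sides of the cutting point under the two systems, which is the kind of case in which the choice of scoring method matters.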

After the spiral test had been run, a Wein- 
stein amytal test (Weinstein & Malitz, 1954) 
was run on 53 of the 81 patients, and this 
was then followed by another complete pres- 
entation of the spiral aftereffect test. The 
Weinstein test is designed to diagnose diffuse 
brain pathology and consists of asking the 
patient the same 21 questions before and 
after an injection of sodium amytal. Any 
changes in the direction of concretism, lack 
of orientation, etc. in the answers are noted. 
The hospital neurologist interested in check- 
ing this test was present throughout the spiral 
and amytal tests, and administered the drug. 
The amytal was given during a 10-minute 
period with the amount given varying in each 
case. The drug was administered until the 
patient showed signs of visual nystagmus, 
blurring of speech, drowsiness, and counting 
errors (backwards from 100 by ones). 

The exact action of sodium amytal is not 
known. One theory states it may act to in- 
hibit some enzyme system or cell metabolism, 
thus preventing the cells from functioning so 
rapidly. Unlike phenobarbital and other simi-
lar drugs, it not only affects higher centers
but acts more specifically, and to a greater
degree, on subcortical areas concerned with 
associations between cortical lobes and di- 
encephalic centers. Because of this action, 
clinical manifestations of organic damage are 
augmented, and the spiral test was rerun as 
it seemed possible that the ability to see the 
negative spiral aftereffect might be further 
impaired when organic brain pathology was 
present. 

At the time of working with these patients 
they had not yet been diagnosed as having 
an organic brain syndrome. It was not until 
all the testing had been completed that at 
least two neurologists rated each patient as 
“organic” or “nonorganic” on the basis of 
discharge summaries and total case material. 
A patient was placed in the “organic” cate- 
gory if he was considered to have organic pa- 
thology above the foramen magnum. On the 
other hand, if the neurologists considered the 
patient not to have organic pathology or to 
have organic pathology below the foramen 
magnum, he was placed in the “nonorganic” 
category. There were nine cases on which 
there was not agreement, and these were not 
considered as part of the final sample. Thus 
the results were computed on a sample of 72 
cases, and on 53 cases in which amytal had 
been given. Forty-five cases of the 72 were 
considered to be organic. These included such 
disease entities as multiple sclerosis, brain 
tumors, Parkinsonism, epilepsy, cortical atro- 
phy, etc. 

Price and Deabler (1955) and Farrett et al. 
(1957) refer to their organics as having corti- 
cal involvement. Nowhere do they state ex- 
actly what they mean by this. Since their 
sample includes cases of Parkinsonism and 
cerebral vascular accidents, and these are
commonly not considered as usually involving 
the cortex directly, it may be inferred that 
“cortical involvement” is used loosely to mean 
“brain involvement.” Therefore, the sample 
of organics used in this study may be con- 
sidered comparable to those in the former 
sample, with the addition of epileptics in this 
sample.


Results 


As was previously stated, the data were 
analyzed using the scoring system reported by
Price and Deabler (1955) and Farrett et al.
(1957) and that used by Gallese (1956). No 
rationale for using half scores is given by the 
former authors. Goldstein (1948) and other 
investigators consider that organic brain pa- 
thology leads to impairment of abstract abil- 
ity, with a concreteness of thinking resulting. 
Possibly Price and Deabler are using this 
concept and considering the response of 
change of dimension as being more abstract 
than movement forward or backward and, 
hence, the superior response. It would seem 
that age, IQ, etc. might be important fac- 
tors, rather than simply organicity, in deter- 
mining the verbal response. Thus the validity 
of considering the form of response as diag- 
nostic has been checked here by scoring the 
records with and without this differentiation. 
Price and Deabler are extremely vague in 
their description of their method of analyzing 
their data. Gallese considers scores of 0-2 as 
organic, and 3-4 as nonorganic. This, then, 
is the cutting point used here. Chi square was 
the statistic used to test the significance of 
the results. 

As can easily be seen from the tables, these 
results differ greatly from those reported 
previously by Price and Deabler and Gallese. 

Table 1
Discrimination Value of Spiral Aftereffect Tests
Using Price and Deabler’s Scoring System
(Before Amytal)

[Table body largely illegible in the source. Columns: Neurological
Classification (Organic, Nonorganic). Rows: Organic scores 0-2 and
Nonorganic scores 3-4, each with No. of patients and % of patients.
Surviving entries: 24, 21, and 46.7%; their cell positions cannot be
recovered.]


Table 2
Discrimination Value of Spiral Aftereffect Tests
Using Gallese Scoring System
(Before Amytal)

[Table body illegible in the source. Columns: Neurological
Classification (Organic, Nonorganic). Rows: Organic scores 0-2 and
Nonorganic scores 3-4 under Gallese scoring (no ½ scores), each with
No. of patients and % of patients, followed by a chi-square note.]


Table 1 gives the results following the
procedure and analysis used by Price and
Deabler. Under the conditions of this study
there is no differentiation of organics from
nonorganics by the spiral aftereffect test.
Several things probably account for these
surprisingly discrepant results. Subjects who
fall at extreme ends of a distribution as they 
do in the former studies (organics vs. nor- 
mals) may often be correctly diagnosed, while 
the same test may fail with individuals fall- 
ing closer to the center of the distribution. 
There are no normal cases in this study com-
parable to those used in the former work.
However, as Fisher, Gonda, and Little (1955)
and Stilson, Gynther, and Gertz (1957) have
emphasized, the sample used here is the
meaningful one for testing the efficacy of a
clinical tool. Previously, the normal popula-
tion had been drawn from hospital person-
nel. Here, all the patients had neurologic
symptoms that had to be evaluated. Some
later were considered on the basis of neuro-
logical (examination, EEG, pneumo-
encephalogram, etc.) to have organic brain
pathology above the foramen magnum, and
some were not. It is on such a sample that a
clinical tool of this sort must prove its use-
fulness.

Furthermore, at the time of testing, the
final diagnosis of the patient was unknown.
It is known this aids objectivity and lessens
the danger of maximizing chance differences
on the test between known groups. In the 
other studies, the investigators were working 
with known groups and could analyze their 
data with this in mind. 

Half scores are not used in obtaining the 
results reported in Table 2. The results still 
do not show significance. Fewer organics are
diagnosed correctly by this method than by
the use of ½ scores. However, the number of
false positives is considerably reduced. For
this reason it seems preferable to use the
scoring system of Gallese, who does not em-
ploy ½ scores.

The results shown in Table 3 indicate that 
the amytal injection does make it more diffi- 
cult to perceive the spiral aftereffect. Of the 
organics, 78.8% are correctly picked up. 
However, as there are 65.0% false positives, 
this procedure is useless as a diagnostic tool. 

The results reported in Table 4 are the only 
significant ones. Half scores were not used 
here. Once again the indication is that the 
use of ½ scores is a poorer method of dif-
ferentiation than is the use of whole scores.
As in Table 2, there is a pick-up of 66.7% 
of the organics, with only 35.0% false posi- 
tives. Although these results are significant 
(p .05), it is questionable whether the dif- 
ferentiation is clear-cut enough to warrant 
the patient’s being subjected to this rather 
lengthy and disagreeable procedure at this 
time. Additional work is needed to determine 
the validity of this procedure and to give an 
indication of its usefulness as a clinical tool. 
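
The chi-square comparisons behind these statements can be sketched as follows. This is an illustration under assumptions, not a reproduction of the author's computation. The group sizes of 33 organic and 20 nonorganic patients after amytal are inferred from the reported figures (26 is 78.8% of 33, 13 is 65.0% of 20, and 33 + 20 = 53). The counts of 22 and 7 used for Table 4 are likewise inferred from the reported 66.7% and 35.0%. Whether a continuity correction was applied is not stated in the text, so both forms are shown.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi square (1 df) for the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates' continuity correction, floored at zero.
        diff = max(diff - n / 2, 0.0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denom

# Rows: organic scores (0-2), nonorganic scores (3-4).
# Columns: neurologically organic, neurologically nonorganic.
# Assumed group sizes after amytal: 33 organic, 20 nonorganic (53 in all).
table_3 = (26, 13, 7, 7)    # half scores: 78.8% picked up, 65.0% false positives
table_4 = (22, 7, 11, 13)   # whole scores: 66.7% picked up, 35.0% false positives

for name, cells in (("Table 3", table_3), ("Table 4", table_4)):
    plain = chi_square_2x2(*cells)
    corrected = chi_square_2x2(*cells, yates=True)
    print(name, round(plain, 2), round(corrected, 2))
```

Under these assumed counts the Table 3 values fall well short of the 3.84 required at the .05 level with one degree of freedom, while the Table 4 values reach it, which agrees with the statement that only the Table 4 results are significant.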

Under none of these conditions do the re- 
sults in any way duplicate the work previ- 
ously done with the spiral aftereffect test. 
Therefore, at the present time it seems that 
this test is not a useful clinical tool to deter- 
mine the presence of organic brain pathology 
in a general hospital population. 


Table 3
Discrimination Value of Spiral Aftereffect Tests
Using Price and Deabler’s Scoring System
(After Amytal)

                                   Neurological Classification
Spiral Criterion
Price and Deabler (½ scores)       Organic        Nonorganic

Organic scores 0-2
  No. of patients                  26             13
  % of patients                    78.8%          65.0%

Nonorganic scores 3-4
  No. of patients                  [illegible]    [illegible]
  % of patients                    [illegible]    [illegible]

Note. Chi square, p > [value illegible].


Table 4
Discrimination Value of Spiral Aftereffect Tests
Using Gallese Scoring System
(After Amytal)

[Table body illegible in the source. Columns: Neurological
Classification (Organic, Nonorganic). Rows: Organic scores 0-2 and
Nonorganic scores 3-4 under Gallese scoring (no ½ scores), each with
No. of patients and % of patients.]

It was often noted that the organic pa- 
tients who saw the aftereffect seemed to no- 
tice it for a shorter duration of time than did 
those without brain pathology. This was also 
noted by Gallese. A study is now in progress 
to determine whether the duration of the 
aftereffect will differentiate organics from 
nonorganics in this general hospital group. 
Although such procedures were not included 
in the present study, it might be valuable to 
study the effect of increasing the difficulty of 
seeing the aftereffect by reduced lighting, 
lowered speed of rotation, etc., rather than 
by the arduous process of administering 
sodium amytal.


Summary 


1. Eighty-one patients admitted to the neu- 
rology ward of a Veterans Administration 
general hospital were given the Archimedes 
Spiral Aftereffect test. This was administered 
in the manner reported by Price and Deabler 
(1955) and Farrett et al. (1957), and scored 
according to their method and also by that 
of Gallese (1956). 

2. Fifty-three patients received the Wein- 
stein amytal test after being given the spiral 
test and were then retested on the spiral. 

3. The diagnostic category into which the 
patients fell was unknown at the time of test- 
ing. After the sample had been run, at least 
two neurologists rated each patient as hav- 
ing, or not having, organic brain pathology 
above the foramen magnum. There was dis-
agreement in nine cases, so these were omitted
from the analysis.

4. Analysis of the results indicates no sig- 
nificant differentiation, except after the amy- 
tal injection using Gallese’s scoring system. 
However, the presence of 35% false positives
with this rather involved procedure discourages
its use as a clinical tool, at least until
further research has purified the procedure
and validated its usefulness. Thus it appears
that in its present form the spiral aftereffect
test is not useful for diagnosing organic brain
pathology in a general hospital setting.


Received January 20, 1958 


REFERENCES 


Armitage, S. G. An analysis of certain psychological
tests used for the evaluation of brain injury.
Psychol. Monogr., 1946, 60, 1-48.

Garrett, E. S., Price, A. C., & Deabler, H. L. Diagnostic
testing for cortical brain impairment. A.M.A.
Arch. Neurol. Psychiat., 1957, 77, 223-225.

Fisher, J., Gonda, T., & Little, K. B. The Rorschach
and central nervous system pathology: A
cross validation study. Amer. J. Psychiat., 1955,
111, 487-492.

Gallese, A. J. Spiral aftereffect as a test of organic
brain damage. J. clin. Psychol., 1956, 12, 254-258.

Goldstein, K. After-effects of brain injuries in war.
New York: Grune & Stratton, 1948.

Price, A. C., & Deabler, H. L. Diagnosis of organicity
by means of spiral aftereffect. J. consult.
Psychol., 1955, 19, 299.

Stilson, D. W., Gynther, M. D., & Gertz, B. Base
rate and the Archimedes spiral illusion. J. consult.
Psychol., 1957, 21, 435-437.

Weinstein, E. A., & Malitz, S. Changes in symbolic
expression with amytal sodium. Amer. J. Psychiat.,
1954, 111, 198-206.

Yates, A. J. The validity of some psychological tests
of brain damage. Psychol. Bull., 1954, 51, 359-379.





Journal of Consulting Psychology 
Vol. 23, No. 1, 1959 


THE YOUNG REBEL: 
SELF-REGARD AND EGO-IDEAL 1


EVA MARIA SHIPPEE-BLUM 


San Mateo Community Hospital, California 


This study was undertaken as a part of a 
larger investigation of the ego-structure of 
rebellious adolescents (Shippee, 1954). For 
this purpose, the psychoanalytical frame of 
reference was adopted since it provides a con- 
venient conceptualization of underlying be- 
havior determinants in terms of the construct 
of ego, varying on a continuum of ego-weak- 
ness to ego-strength. The primary hypothesis 
to be tested was that young rebels are en- 
dowed with less ego-strength than their more 
cooperative agemates. The deficiency in ego- 
strength was assumed to result in impaired 
functioning in reality adaptation for which 
an adequate ego is a prerequisite (Redl & 
Wineman, 1952). 

Freudian theory accords the ego the role of 
mediator between the demands of successful 
survival in the world of reality and the demands
of gratifying the inner world of instinctual
impulses. Such a “reality orientation”
depends upon an ego strong enough to
tolerate unpleasant inner tensions until the 
time when instinctual needs can be appropri- 
ately satisfied without harm to the individual 
(or to others). Ego-strength refers to the abil- 
ity to withstand tension without resorting to 
various defense mechanisms such as distor- 
tion, repression, displacement, uninhibited at- 
tack or flight. Such an ego should be able to 
make use of and enjoy opportunities maxi- 
mally without undue anxiety, guilt, avoid- 
ance, etc. 

The connection between “reality orientation”
and “ego” is described by Fenichel
(1954, p. 35): “The origin of the ego and
the origin of the sense of reality are but two
aspects of one developmental step.”

1 This investigation was supported in part by a
research grant (M-528) from the National Institute
of Mental Health of the National Institutes of
Health, Public Health Service. The study was carried
out at Stanford University. Jeanne Block, Barclay
Martin, and Joseph Luft collaborated in this
research.

The sense of reality, also designated as the 
“reality principle” in contra-distinction to the 
“pleasure principle,” implies, as indicated 
above, the ability to wait, to postpone im- 
pulse gratification. A temporal delay is in- 
terpolated between the original stimulus-re- 
sponse unit typical for infant behavior. The 
long-circuiting of immediate reflex-type dis- 
charge phenomena is made possible by the 
emergence of intervening symbolic processes, 
i.e., the emergence of the ego. Fenichel (1954, 
p. 367 ff.) writes as follows: 

The infant as long as he acts according to the pleasure
principle tries to discharge tensions immediately,
and experiences any excitement as “trauma,” which
is answered by uncoordinated discharge movements.
They [impulse neurotics, i.e., delinquents, etc.]
still act as if any tensions were a dangerous “trauma.”


As mentioned before, in this study we are
concerned with the ego’s attribute of ego-
strength. In order to pass from the level of
abstraction to the behavioral aspects of ego-
strength, Block’s (1950) intervening variable
“ego-control” is adopted in this investigation.
“Ego-control” is to provide the transitional
step from ego-strength as manifested in tension-
binding-capacity,2 to the experimental
measure thereof. The ego-control system, according
to Block, functions to maintain the
most subjectively efficient balance for the individual
between immediate gratification of
needs and delay of gratification congruent
with the requirements of the external world.

2 Tension-binding-capacity as used here is the technical
equivalent for the more descriptive terms: tension-tolerance,
frustration-tolerance, impulse-control.

He states that the ratio, Immediate Gratification/Delay
of Gratification (IG/DG),
tends toward maximization as a consequence 
of the hedonic postulate which most person- 
ality theories accept. Block terms individuals 
who more or less maintain an optimal IG/DG 
ratio as Adequate Controllers. This designa- 
tion corresponds to what the present writer 
refers to as optimal ego-strength. Block dis- 
tinguishes between the Adequate Controller 
and the Under Controller; the latter is one 
whose ego-control system has insufficiently 
developed the necessary tension tolerance. 
His IG/DG ratio is above the optimal range.
We recognize in the individual with low ten- 
sion tolerance one of the aspects of impaired 
ego-strength discussed previously as a factor 
in rebelliousness. 

In this study the major hypothesis of de- 
creased ego-strength in the rebellious adoles- 
cent was tested by comparing the tension- 
binding-capacity of a group of rebellious 
adolescents to an equated group of coopera- 
tive adolescents. As an operational definition 
of low tension-binding we adopted Block’s 
(1950) inferred variable “under-control” and 
his IG/DG measure thereof which consisted 
of a questionnaire designed to assess under- 
control. The prediction which follows is: On 
an ego-control questionnaire designed to meas- 
ure Impulse Gratification vs. Delay of Gratifi- 
cation, the rebel will be on the under-con- 
trolling end of the distribution of scores. 

Two subsidiary hypotheses were derived 
from the primary concept of ego-weakness in 
the rebel. The first hypothesis was that the 
rebel’s reality testing is inadequate. Spe- 
cifically, it was predicted that one important 
aspect of reality testing is distorted, namely, 
the objective evaluation of the rebellious ado- 
lescent’s own self. Indeed, a realistic self- 
evaluation is only possible if the individual 
is able to tolerate the frustration arising from 
an awareness of his limits. Such a frustration 
tolerance requires a strong ego. Consequently, 
a person lacking the prerequisite ego-strength, 
as exemplified by the rebellious adolescents, 
will show low frustration tolerance. Low frus- 
tration tolerance may manifest itself in vari- 
ous ways; for instance, in a narcissistic ori- 
entation toward the self. 


Following Block (1950), it was posited 
that self-evaluation varies along a continuum 
ranging from unrealistically favorable to unrealistically
unfavorable. Present evidence indicates
that a realistic self-regard implies
moderate self-esteem. A narcissistic self-re- 
gard, on the other hand, can manifest itself 
either as feelings of grandiose self-esteem or 
in the converse, as feelings of exaggerated in- 
feriority. 

In the present study, the self-regarding at- 
titudes of a group of rebellious adolescents 
were compared to the self-regarding attitudes 
of an equated group of cooperative adoles- 
cents. Realistic self-regard was defined as 
moderate self-esteem. It was measured opera- 
tionally by scores clustering around the mean 
of the distribution obtained on an instru- 
ment designed to measure self-esteem. 

Narcissistic self-regard was defined as im- 
moderately high self-esteem or extremely low 
self-esteem. It was measured by extreme 
scores on either the high or the low end of 
the distribution of scores obtained by means 
of an instrument designed to measure self- 
esteem. 

The hypothesis of impaired ego-strength in 
the rebellious adolescent allowed the deduc- 
tion that they will exhibit narcissistic atti- 
tudes toward themselves. Such attitudes were 
defined both as immoderately high self-esteem 
and as extreme feelings of inferiority. From 
this a second prediction follows: On an adjec- 
tive list, checked for socially valued attitudes 
of the self, the rebel will tend to obtain extreme
self-esteem scores, revealing feelings of
exaggerated worth or inferiority. 

The third hypothesis derived from the as- 
sumption of ego-weakness in the rebel is that 
of inadequate ego-ideal formation. Psycho- 
analysts regard the superego as an outgrowth 
of the ego, an outgrowth which depends upon 
the ego’s tension-binding function, which 
serves to delay impulse gratification until
such gratification is compatible with demands
imposed by society (Fenichel, 1954). Ordinarily
it is the parents who first mirror the
society and who transmit society’s codes and
expectations through the application of re- 
wards and punishments. According to psy- 
choanalytic theory, the superego results from 
the child’s emotional acceptance and subsequent
internalization of parental demands
and prohibition. Such an acceptance and in- 
ternalization requires that the ego be strong 
enough to renounce present pleasures for the 
sake of later (parental) approval. In the 
child this capacity to renounce is acquired 
as a consequence of the ego’s developing abil- 
ity to tolerate tension. 

But the ego of the rebellious adolescent is 
weak and cannot tolerate tension. The rebel 
is impulsive. His impulsiveness demonstrates 
his ego weakness. If his ego is too weak to 
bind tensions it should also be too weak to 
allow the internalization of parental demands 
which constitute the growth of the superego 
(Zucker, 1943). Thus it can be expected that 
the superego formation of rebels is defective. 
That is the hypothesis tested here. 

For the purpose of this experiment, only 
one aspect of superego defect was investi- 
gated. It was the assessment of the forma- 
tion of the ego-ideal, i.e., the formation of 
the conscious part of the superego. Ego-ideal 
was defined as the complex of positive cul- 
tural values, which stem from the parents and 
represent them, and which have been accepted 
by the child. These serve as a model which 
the child strives consciously to emulate. An 
attitude of positive esteem for the parents and 
their perceived qualities is implied in this defi- 
nition. 

The status of a child’s ego-ideal was meas- 
ured in terms of the relationship between the 
subject’s esteem for his parents and his own 
self-esteem. A strong ego-ideal formation was 
defined as a parent-esteem/self-esteem ratio 
equal to or greater than unity on an instru- 
ment designed to assess culturally valued at- 
tributes of the self and the parents. A weak 
ego-ideal formation was defined as a parent- 
esteem/self-esteem ratio smaller than unity 
on the same instrument. The hypothesis of 
defective ego-strength in the young rebel led 
to the prediction of the formation of a weak 
superego as measured by our instrument and 
defined as an ego-ideal score lower than unity. 
Prediction 3 stated that: On an ego-ideal 
measure, the rebel’s self-esteem will tend to 
exceed his esteem for his parents, i.e., he will 
tend to obtain a parent-esteem/self-esteem 
ratio below unity. 


The Sample 


The Ss used in this study were 75 high 
school students from the freshman and sopho- 
more classes of a suburban high school. Their 
modal age was 14 years, with a range from 
13 to 16 years. Their intelligence was average,
with a mean at the 54th percentile on
the American Council on Education test battery.
The social status of the Ss’ parents was
predominantly middle class. 

Procedure for categorization. In order to 
divide the preliminary sample into the rebel- 
lious and cooperative groups, independent ob- 
servations of the students were obtained. On 
the basis of these observations, the sample 
was divided into three groups according to 
the severity of the discipline problem which 
the Ss presented to the high school authorities. 
For the final sample of 75 Ss, only the two
extreme groups were retained; the middle
group was eliminated.

Three sources of information were utilized 
to establish criteria for placement of each S 
into one of the three groups: the rebellious 
group, the moderate group, and the coopera- 
tive group. One source was the Ss’ citizenship 
grade obtained during an 18-week interval. 
For each S, 18 ratings were available (three 
ratings by each of his six teachers). 

The second source of information was the 
teachers’ rankings of all the students in the 
freshman and sophomore classes on a con- 
tinuum of rebelliousness. The third source of 
information was the dean’s black list. On that 
list were found the names and the offenses of 
those students who had not responded to the 
teachers’ disciplinary measures. This group 
of pupils represented the most serious disci- 
plinary problems calling for special handling. 

The three scores obtained from trichotomiz- 
ing the distributions of citizenship scores, 
teachers’ rankings, and presence on the black 
list were combined for each S. A new fre- 
quency distribution was made for the com- 
bined scores. Ss whose combined score was in 
the highest third were classified as rebels. Ss 
who obtained a zero score were designated as 
cooperators. The remaining middle group was 
eliminated from the final sample. The final 
sample of 75 Ss comprised 30 cooperators 
and 45 rebels. No significant differences were
found between the rebel and cooperator groups
in the age or the intelligence of the Ss.
Nor were any significant differences found in 
the occupational status of their parents. 
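
The grouping procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the paper does not report how each source was coded, so the 0/1/2 coding by thirds, the code of 2 for presence on the black list, the function names, and the six-student data set are all hypothetical.

```python
def trichotomize(values):
    """Code each value 0, 1, or 2 by thirds of its rank (assumed coding)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    codes = [0] * len(values)
    for rank, i in enumerate(order):
        codes[i] = min(2, rank * 3 // len(values))
    return codes


def classify(combined):
    """Zero score: cooperator; highest third: rebel; all others dropped."""
    cutoff = sorted(combined)[(2 * len(combined)) // 3]
    labels = []
    for score in combined:
        if score == 0:
            labels.append("cooperator")
        elif score >= cutoff:
            labels.append("rebel")
        else:
            labels.append("dropped")
    return labels


# Hypothetical data for six students (higher = more of a discipline problem).
citizenship = [0, 1, 5, 9, 2, 8]
teacher_rank = [1, 2, 5, 6, 3, 4]
black_list = [0, 0, 0, 2, 0, 2]  # assumed code: 2 if on the list, else 0

combined = [c + t + b for c, t, b in zip(trichotomize(citizenship),
                                         trichotomize(teacher_rank),
                                         black_list)]
labels = classify(combined)
print(list(zip(combined, labels)))
```

In this sketch ties within a source are broken by position, which a real scoring scheme would have to settle explicitly.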


The Instruments 
The Ego-Control Questionnaire 


To measure ego-strength as a function of 
the individual’s capacity to bind tensions, i.e.,
to control impulse gratification, a question- 
naire was constructed. The method of its con- 
struction, the reliability and validity are dis- 
cussed in full by Luft (1957). A few examples 
from the ego-control questionnaire are given 
below: “I sometimes open presents before I’m 
supposed to.” “I usually make a plan before 
I start to do something.” “I am always calm 
and cool.” “I lose my temper easily.” 

The Adjective Check List 

To assess self-regarding attitudes and ego- 
ideal formation, an adjective check list was 
devised. Each adjective included in the list 
had to meet the following criteria: (a) it had 
to be intelligible to the average freshman, (b)
it had to describe personality unambiguously 
either favorably or unfavorably in the areas 
of intellectual, emotional, interpersonal, and 
physical functioning, (c) it had to differenti- 
ate between Ss, and (d) it had to differenti- 
ate between the Ss and their fathers and 
mothers. 

Pretests on 72 high school sophomore and 
junior students yielded a preliminary pool of 
151 adjectives. These were screened by con- 
sulting the Thorndike-Lorge word list to 
eliminate all those adjectives which were be- 
yond the understanding of the average 13- 
year-old. In another pretest, the adjectives 
were further screened for intelligibility: the 
72 pretest students were asked to define each 
of the randomized adjectives by making sen- 
tences out of them. Only those words which 
the students had been able to define correctly 
were retained in the revision list. 

To establish the positive or negative value
connotations of the adjectives empirically, 26
high school juniors were asked to judge each
word. Eighty per cent agreement between Ss
as to favorable implication had been decided
upon as the cutting point. Accordingly, each
word for which there was less than 80%
agreement was eliminated from the list as be- 
ing too ambiguous a value statement to allow 
a clear differentiation of its positive or nega- 
tive connotation. 
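
The 80% cutting point can be illustrated with a short sketch. It assumes that agreement in either direction (favorable or unfavorable) counts, since unfavorable words such as "Selfish" appear in the final list; the function name, the "+"/"-" vote coding, and the sample votes (including the word "Quiet") are hypothetical.

```python
def retain_by_agreement(judgments, cutoff=0.80):
    """Keep words whose judges agree on one direction at or above the cutoff."""
    kept = {}
    for word, votes in judgments.items():
        favorable = sum(1 for v in votes if v == "+") / len(votes)
        agreement = max(favorable, 1 - favorable)
        if agreement >= cutoff:
            kept[word] = "favorable" if favorable >= cutoff else "unfavorable"
    return kept


# Hypothetical votes from 26 judges ("+" favorable, "-" unfavorable).
judgments = {
    "Cheerful": ["+"] * 25 + ["-"] * 1,
    "Selfish": ["+"] * 2 + ["-"] * 24,
    "Quiet": ["+"] * 15 + ["-"] * 11,  # too ambiguous, so it is dropped
}
kept = retain_by_agreement(judgments)
print(kept)
```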

The remaining 100 adjectives were pre- 
tested to weed out all those which did not 
differentiate sufficiently between Ss. An item 
analysis of the responses to the adjectives 
was made. Twenty-five adjectives which pro- 
duced the same response in all of the Ss were 
rejected. Seventy-five items were retained in 
the final form of the adjective check list. The 
final adjective check list was administered to 
two freshman classes (N = 63) to pretest its
discriminatory power for three test conditions:
(a) self description, (b) description
of fathers, and (c) description of mothers. 
It was found that the final form of the adjec- 
tive check list differentiated adequately under 
all three conditions. 

The following 75 words constituted the
final check list:



Well Dressed 
Confident 
Thoughtful 
Slow 
Humorous 
Insecure 
Enthusiastic 
Angry 
Cooperative 
Ashamed 
Selfish 
Quarrelsome 
Cheerful 
Helpful 
Sulky 
Creative 
Considerate 
Good Looking 
Popular 
Attractive 
Dull 
Highstrung 
Respected 
Wise 
Complaining 


Timid 

Jumpy 
Impatient 
Narrow Interests 
Graceful 

Jolly 

Blue 

Sunny Disposition 
Skillful 

Easily Hurt 
Nagging 
Moody 
Inventive 
Tired 
Awkward 
Merry 

Calm 

Alert 

Resentful 
Athletic 

Even Tempered 
Original 
Loving 
Intelligent 
Warm 


Foolish 
Worried 
Irritable 
Confused 
Bossy 

Nervous 
Bright 

Messy 
Vigorous 

Easy Going 
Obeying 
Generous 
Strong 
Relaxed 
Sociable 
Sympathetic 
Catch on Quickly 
Gentle 

Well Groomed 
Clear Thinking 
Contented 
Clumsy 
Agreeable 
Artistic 

Good Natured 

The adjective check list was administered 
to the experimental sample under three con- 
ditions: 

Condition A: Each S was asked to mark as 
true the adjectives which described him as he 
really was. 

Condition B: Each S was asked to mark as 
true all the adjectives which described his
father as he (the S) saw him. 

Condition C: The instructions for Condi- 
tion C were to mark as true those items which 
described the S’s mother correctly. 

The adjective check list was scored by 
counting separately the number of positive
attributes checked by each S under each of the three
conditions. Three total esteem scores were 
thus available for each S. The self-esteem was 
measured by the score obtained under Condi- 
tion A. The father-esteem was measured by 
the score obtained under Condition B, and 
the mother-esteem was measured by the score 
obtained under Condition C. 


The Ego-Ideal Measure 


For the purposes of this study, ego-ideal 
was defined as the ratio between the S’s 
esteem for his parents and the S’s self-esteem. 
Weak ego-ideal was thus defined as a parent- 
esteem score (P-E) lower than the self-esteem 
score (S-E); in other words, a weak ego-ideal 
was equated with a P-E/S-E ratio below 
unity. A strong ego-ideal was equated with 
a P-E/S-E ratio equal to or above unity. 

In order to obtain the ratio between an
S’s parent-esteem and his self-esteem scores,
the number of favorable adjectives checked 
for the father and the mother under Test 
Conditions B and C were added. This sum 
constituted the numerator in the P-E/S-E 
ratio. The denominator was the number of 
favorable adjectives the S had checked for 
himself under Test Condition A. A ratio falling
below 2.00 denoted weak ego-ideal. A
ratio equal to or above 2.00 denoted strong ego-ideal.


Table 1

Chi Square Table of the Relationship Between
Low Ego-Control and Rebelliousness

Ego-Control    Cooperators    Rebels    Total
UC                  10           29        39
OC                  16           15        31
Total               26           44        70

Note. r_t = .43; χ² = 4.99; .05 > p > .02


Results 


Relationship between tension-binding-ca- 
pacity and rebelliousness. This study’s basic 
assumption of ego-weakness in the rebel led 
to the hypothesis of the rebel’s decreased 
tension-binding-capacity. Operationally, ten- 
sion-binding-capacity was defined as the ratio 
between impulse gratification vs. delay of 
gratification (IG/DG ratio). It was predicted 
that on a questionnaire designed to measure 
the IG/DG ratio the rebel will be on the 
under-controlling end of the distribution, i.e., 
he will obtain lower scores. 

The ego-control questionnaire was adminis- 
tered to 26 cooperators and 44 rebels. The 
tetrachoric correlation was computed as a 
measure of the degree of association between 
rebelliousness and under-control. Statistical 
analysis yielded an r_t = .43. The chi square
was 4.99, which for one degree of freedom is 
significant between the .05 and .02 levels of 
confidence. 

This degree of statistical significance sup- 
ports the hypothesis that decreased tension- 
binding-capacity as defined here and rebel- 
liousness are correlated. 
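
As a check on the reported figure, the chi square can be recomputed from a 2 x 2 table. The cell counts used here are an assumption: they are the values implied by the legible entries of Table 1 together with the group sizes given in the text (26 cooperators, 44 rebels), and the assignment of the first row to the under-controlling end is likewise inferred. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi square, without continuity correction, for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: under-control, over-control (assumed); columns: cooperators, rebels.
chi2 = chi_square_2x2(10, 29, 16, 15)
print(round(chi2, 2))  # prints 4.99
```

With one degree of freedom, a value of 4.99 falls between the conventional .05 and .02 critical values (3.84 and 5.41), in line with the statement in the text.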

Relationship between “unrealistic” self-esteem
and rebelliousness. The hypothesis
of decreased tension-binding-capacity in the 
rebel permitted the inference that the rebel 
will show deficiencies in reality-testing as, 
for instance, in his capacity to evaluate him- 
self realistically. Prediction 2 stated that the 
rebel’s unrealistic self-regard would be re- 
vealed on the adjective check list in extreme 
self-esteem scores, indicating feelings of ex- 
aggerated worth or inferiority. 

The obverse was predicted for the coopera- 
tor: his more realistic self-regard was ex- 
pected to yield moderate self-esteem scores. 

The adjective check list was administered 
to 30 cooperators and to 45 rebels. It was 
scored by counting for each S the number of 
positive adjectives checked. The difference in 
variance of the two distributions of scores was 
tested. The obtained sigma for the coopera- 
tors was 8.51; the sigma for the rebels was 
12.71. The observed difference was in the 
predicted direction and yielded an F ratio of 
2.21. For 44 df and 29 df, respectively, the
obtained F ratio is significant beyond the .01 
level of confidence. This degree of statistical 
significance supports the hypothesis that “un- 
realistic” self-esteem as defined by extreme 
scores on a self-esteem measure is a function 
of rebelliousness. 
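
The variance comparison can be sketched as a ratio of squared sigmas. How the reported F was obtained is not stated; the adjustment below, which rescales each variance by N/(N - 1), is only a guess that happens to come close to the reported value. The function name and its parameters are hypothetical.

```python
def variance_ratio(sd_large, sd_small, n_large=None, n_small=None):
    """F ratio of two variances; optionally rescale each by N / (N - 1)."""
    f = sd_large ** 2 / sd_small ** 2
    if n_large and n_small:
        f *= (n_large / (n_large - 1)) / (n_small / (n_small - 1))
    return f


raw = variance_ratio(12.71, 8.51)               # rebels over cooperators
adjusted = variance_ratio(12.71, 8.51, 45, 30)  # with the assumed rescaling
print(round(raw, 3), round(adjusted, 3))
```

The unadjusted ratio is about 2.23 and the adjusted one about 2.205, so the reported 2.21 is consistent with the second reading, though the paper does not confirm it.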

Relationship between rebelliousness and 
ego-ideal weakness. A weakened ego-ideal in 
the rebel was predicted from the hypothesis 
of impaired ego-strength underlying asocial 
behavior and the consequent defective super- 
ego functioning. Prediction 3 stated that the 
ego-ideal ratios of rebellious adolescents 
would tend to be less than 2.00, or unity, 
more frequently than the corresponding ego- 
ideal scores of cooperative adolescents, which 
would be found more frequently above unity, 
i.e., be equal to or greater than 2.00.

The adjective check list was administered 
to 25 cooperators and to 39 rebels.3 By summing
the father- and mother-esteem scores
and dividing the result by the self-esteem
score, (F + M)/S, the “ego-ideal” ratio was
computed for each S. 
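
The ego-ideal ratio and its cutting point can be written out as a small sketch. The scores are counts of favorable adjectives checked under Conditions A, B, and C; the function name and the sample scores are hypothetical, and the handling of a zero self-esteem score is an assumption, since the paper does not discuss that case.

```python
def ego_ideal(father_esteem, mother_esteem, self_esteem):
    """Return the (F + M)/S ratio and its classification at the 2.00 cutoff."""
    if self_esteem <= 0:
        raise ValueError("ratio is undefined when no favorable self-adjectives are checked")
    ratio = (father_esteem + mother_esteem) / self_esteem
    return ratio, ("strong" if ratio >= 2.00 else "weak")


# Hypothetical scores for two Ss.
ratio_a, label_a = ego_ideal(30, 34, 28)  # parents rated above self
ratio_b, label_b = ego_ideal(20, 22, 30)  # self rated above parents
print(round(ratio_a, 2), label_a, round(ratio_b, 2), label_b)
```

Because the numerator sums two parents, a ratio of 2.00 corresponds to the "unity" of the verbal definition: parent-esteem equal, on average, to self-esteem.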

In order to show that the rebels, more fre- 
quently than cooperators, tend to reject their 
parents as ego-ideal, chi square was com- 
puted. The obtained chi square of 3.05 was 
significant beyond the .05 level of confidence 
(one-tailed test), for 1 df. The hypothesis 
that rebels have a weaker ego-ideal is sup- 
ported by the data. The results are summa- 
rized in Table 2. 


Table 2

Chi Square Table of the Relationship Between
Weak Ego-Ideal and Rebelliousness

Ego-Ideal Ratio    Cooperators    Rebels
1.00-1.99
2.00-4.50
Total

Note. χ² = 3.05 (predicted direction); p < .05

3 The number of Ss decreased from one set of conditions
to another. Not all of those who had filled
out the adjective check list under Condition A were
willing to continue under Conditions B and C. The
attrition rate was six Ss, one cooperator and five
rebels.


Discussion


A great deal of knowledge has accumulated
regarding the determinants of asocial behavior
in the juvenile delinquent. Much less is
known about the ‘average’ rebellious adolescent
who has perpetrated similar misdemeanors
but has not come to the attention
of the law. The study presented here was
concerned with the rebel before, rather than
after, social intervention has branded him as
a member of a special outgroup. Our rebellious
adolescents were at the time of the
study in regular attendance at school. Although
they were known to their teachers and
counsellors as difficult to handle, they were
more nearly representative of the adolescent
population than the “legal delinquents” upon
whose behavior much of the research on
asocial behavior has been based. The study
of the rebel was conceived of as contributing
a link in the continuum of social behavior
ranging from the extremely disruptive to constructive
activities. The psychoanalytic construct
“impaired ego-strength” presented itself
as a convenient unifying principle, which
allowed us to order divergent behavior manifestations
into one system. The positive findings
of this study substantiated psychoanalytical
theory at several levels of abstraction.
By extrapolating from the data, the primary
explanatory principle was confirmed to the
extent that the study affirmed the lower order
hypotheses derived from the construct “impaired
ego-strength.”

There is no doubt that alternate hypotheses
for the observed results can be proposed.
These are certain to occur to the critical
reader and are therefore not offered here.
Many possibilities for extending the present
research will also come to mind. Of these only
one is mentioned here: the argument that the
rebels were correct in their negative evaluation
of their parents cannot be met directly in
the absence of any objective descriptions of
the two sets of parents. It may be that dispassionate
observers would tend to agree with
the rebels that their parents were less admirable
than parents of cooperators; but perhaps
not. In the event that the rebels were
correct in stating that their parents had few
socially desirable traits, one could then ascribe
their defective superego to the inadequacies
of their models and the conflict
between their parental models and the ideals
held up by the society in which the rebels 
live. If that is the case, superego development 
must be viewed in terms of the actual failing 
of parents to provide worthy behavior models 
to the child. On the other hand, should it be 
shown that objective evaluation of the rebels’ 
parents proves them not to differ from other 
parents in the possession of socially admir- 
able traits, then the qualities attributed to 
parents by rebellious offspring represent their 
own peculiarly dour views. Whether such 
views would originate from hostility, disap- 
pointment, or projection remains to be in- 
vestigated. 

The evaluation of the parents of rebels 
beckons as the next research step. 


Summary and Conclusions 


A sample of high school students was di- 
vided into 45 rebels and 30 cooperators. A 
questionnaire was constructed to measure ego- 
strength defined as tension-binding-capacity. 
Rebelliousness was correlated with decreased 
tension-binding-capacity. An adjective check 
list was developed. On it rebels revealed un- 
realistic self-regard, which differed from the 
realistic self-appraisal of the cooperators. 
Rebels were found to regard themselves more
highly than they regarded their parents; cooperators
admired their parents more than
themselves.

The results supported the psychoanalytic 
thesis of ego-weakness in the rebel. Implica- 
tions for further research were discussed. 


Received January 21, 1958. 
AN APPROACH TO THE FACTOR STRUCTURE
OF CLINICAL JUDGMENTS


MARTIN D. CAPELL AND JULIAN WOHL


VA Mental Hygiene Clinic, Detroit


There is a growing awareness in the clinical 
field that the process of describing, rating, or 
classifying a person is extremely complex and 
that there are many determinants of the clini- 
cal impression or judgment. Previous studies 
(Buss & Gerjuoy, 1957; Grayson & Tolman, 
1950) of concepts used by clinicians have 
not been concerned with either the clinical 
judgment itself or with attempting empirically
to dimensionalize the concepts. Psychologists
tend to be critical of their concepts
and suspicious of themselves inasmuch as 
they have come to recognize that, in making 
judgments about other people, they do so not 
only as scientists but also as human beings. 
Recent interest in the utilization of counter- 
transference in the psychotherapeutic process 
(Cohen, 1954) is one aspect of this concern. 
Osgood (1953), in his work with the seman- 
tic differential, found that the judgments of 
words by groups of college students could be 
accounted for on the basis of three factors: 
evaluation, activity, and potency. The evalua- 
tive (“good-bad”) factor alone accounted for 
approximately 50% of the total variance in 
the judgments. In the present study, the fac- 
tor structure of clinical judgment is investi- 
gated with particular interest being given to 
the role of the evaluative factor in such judg- 
ments. 


Method and Procedure 


The psychiatric, social work, and psycho- 
logical staffs at the Detroit VA Mental Hy- 
giene Clinic, a total of 28 persons, were ap- 
proached individually and requested to give 
10 of the concepts they commonly used in 
describing patients. Following this, each person
was asked to indicate the “opposite” of each
of the concepts he or she had contributed.
The obtained concept-dimensions were categorized
by the Es. Twelve dimensions were
selected on the basis of frequency of occurrence
(four or more). These were as follows:


anxious - calm
narcissistic, self-invested - object-invested
inhibited, rigid, reserved - expansive, expressive, outgoing
orally demanding - realistic help-seeking
flat, apathetic, affectless - sensitive, emotionally responsive
defensive, guarded - open, free of defensiveness
distorted reality testing - undistorted reality testing
precarious, transient interpersonal relationships - stable, firm interpersonal relationships
hostile, unaccepting, suspicious - friendly, trusting, accepting
passive - active, aggressive
dependent - independent
inadequate - adequate, capable


Three of the dimensions which Osgood found
to load highly on the evaluation factor were
added:

pleasant - unpleasant
honest - dishonest
kind - cruel

Finally, five typical psychoanalytic constructs:

much secondary gain - little secondary gain
positive Oedipal strivings - negative Oedipal strivings
genital character - pregenital character
many anal character traits - few anal character traits
many oral character traits - few oral character traits

were included. The 20 dimensions were then
ordered randomly and reproduced on a duplicated
form as a series of seven-point scales.
Following a presentation of the case of a patient
to one of the clinic’s consulting psychoanalysts,
every person in attendance (excepting
the Es) was asked to rate the patient on
the scales. The professional discipline of the
judge was also obtained, but the ratings were
otherwise anonymous. The judges consisted
of 3 psychiatrists, 11 clinical social workers,
and 2 clinical psychology trainees, a total of
16 persons.


Table 1

Rotated Factor Matrix Derived from
Scale Intercorrelations

Scales a                                              Factor b
Anxious
Narcissistic, self-invested
Inhibited, rigid, reserved
Pleasant
Orally demanding
Flat, apathetic, affectless
Defensive, guarded
Much secondary gain
Positive Oedipal strivings
Distorted reality testing
Precarious, transient interpersonal relationships
Kind
Genital character                                        -03
Hostile, unaccepting, suspicious                          54
Many anal character traits                                04
Passive                                                   48
Many oral character traits                                17
Dependent                                                 01
Honest                                                    06
Inadequate                                                09

Note. Decimal points have been omitted in the factor loadings.
a Only the positive ends of the scales are reproduced here.
Consult the text for the complete scales.
b The mean squared factor loadings are as follows: I, .198;
II, .258; III, .096; and IV, .075.


Product-moment intercorrelations among the 
scales were obtained, and a centroid factor 
analysis performed on the resulting matrix. 
Four factors were extracted. In order to de- 
termine the extent to which values did enter 
into the ratings, the first factor was rotated 


through the locus of the points which defined 
the three evaluative scales. The remaining 
factors were rotated blindly and orthogonally. 


Results and Discussion 


The rotated factor matrix is given in 
Table 1. The scale loadings on Factor I, the 
factor in which the evaluative component in 
the ratings was maximized, suggest that a 
number of clinical concepts have an evaluative
meaning along with their diagnostic intent.
In particular, such concepts as distorted
reality testing, orally demanding, inhibited, 
and flat appear to reflect in large part “bad” 
characteristics, while genital character and 
positive Oedipal strivings have “good” impli- 
cations. In the light of this evidence, caution 
in the use of these terms in clinical parlance 
might well be considered. 

The mean squared factor loading on Fac- 
tor I (.198) constitutes approximately 30% 
of the common factor variance obtained. This 
can be contrasted with the results obtained 
by Osgood (1953), where roughly 70% of 
the common factor variance was evaluative. 
Thus, as compared with college students per- 
forming semantic ratings, clinical judgment 
appears to be relatively unbiased. This com- 
parison needs to be limited by the fact that 
the ratings in the present study are presum- 
ably denotative, rather than connotative as in 
Osgood’s study. 

Factor II, orthogonal to the evaluative fac- 
tor, appears to be a dimension of psychopa- 
thology and contains about 47% of the com- 
mon factor variance. It suggests that clinical 
judgment of pathology is predominantly un- 
biased by evaluative considerations. Factors 
III and IV do not appear to be as readily 
interpretable. Factor III is tentatively identi- 
fied as a dimension referring to the person’s 
desire and need for psychotherapeutic inter- 
vention. Factor IV appears to be an inhibi- 
tion or social withdrawal dimension. 
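The percentages quoted above can be checked against the mean squared factor loadings printed in the footnote to Table 1. The following is a minimal sketch, assuming that "common factor variance" means the sum of the four mean squared loadings; the function and variable names are illustrative and not from the report. On the printed values, Factor I comes to about 32% (consistent with "approximately 30%" and with "less than one third" in the Summary), while Factor II comes to about 41%, somewhat below the 47% given in the text; the sketch only reports what the printed loadings yield.

```python
# Mean squared factor loadings as printed in the footnote to Table 1.
mean_squared_loadings = {"I": 0.198, "II": 0.258, "III": 0.096, "IV": 0.075}


def common_variance_shares(loadings):
    """Return each factor's share of the summed mean squared loadings.

    Assumption: the sum over the extracted factors is taken as the
    common factor variance.
    """
    total = sum(loadings.values())
    return {factor: value / total for factor, value in loadings.items()}


shares = common_variance_shares(mean_squared_loadings)
for factor, share in shares.items():
    print(f"Factor {factor}: {share:.1%} of common factor variance")
```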

The present study can be considered only 
as exploratory, considering the small N in- 
volved. Nevertheless, the obtained factors 
follow along lines that might be expected in 
an outpatient, psychoanalytically oriented, 
psychotherapeutic agency. The clinic personnel
look at the range and degree of psychopathology,
using generally psychodynamically
derived concepts rather than those coming 
from descriptive psychiatry. They judge peo- 
ple as to their need and wish for treatment 
and with regard to their accessibility to a 
psychotherapeutic approach. A number of 
their concepts are endowed with considerable 
evaluative nuance. 


Summary 


The factor structure of clinical judgments 
was explored with particular interest being 
placed upon the role of values in the making 
of such judgments. A group of mental hygiene
clinic personnel was asked to rate
a patient on a number of scales derived 
from commonly used clinical concepts. Three 
“evaluative” scales and five psychoanalyti- 
cally derived scales were also included. The 
scales were intercorrelated and a factor analysis
performed. The first factor was rotated
through the evaluative scales. Less than one
third of the common factor variance was ac- 
counted for by this factor; it suggests that 
evaluation does play a part in clinical judg- 
ment. The remaining three factors were ten- 
tatively identified as dimensions of psycho- 
pathology, need and wish for psychotherapy, 
and social inhibition. 


There is now considerable literature (Fisher
& Cleveland, 1955; Fisher & Cleveland,
1958; Witkin, Lewis, Hertzman, Machover,
Meissner, & Wapner, 1954) devoted to the
study of body image problems. It has become
apparent that the individual’s attitudes
toward his body not only supply valuable
information about his past socialization experiences
but also provide a basis for predicting
important aspects of his behavior.
Body image measures have been successfully 
used to clarify such diverse phenomena as 
the appearance of phantom limb sensations 
following amputation (Haber, 1956); the 
content of schizophrenic delusions and hallucinations
(Fenichel, 1945); differential capacity
to deal with stress (Fisher & Cleveland,
1958); and the ability to make spatial
judgments in unstructured settings (Witkin
et al., 1954). In earlier work the writer
focused particularly on the possibility of 
predicting individual patterns of physiologi- 
cal reactivity in terms of body image vari- 
ables. This work aimed to verify the general
proposition that the individual’s attitudes
toward his body may influence its
physiological reactivity. Data were obtained 
which indicated that certain patterns of 
physiological response can be predicted from 
body image variables. Thus, it was estab- 
lished (Fisher & Cleveland, 1957; Fisher & 
Cleveland, 1958) that the more definite an 
individual’s body image boundaries the more 
likely he is to show relatively greater reac- 
tivity in the outer body layers than in the 
body interior. More recently, observations
were made which indicate that individuals
manifest differences in the GSR reactivity of
their right and left body sides and that the
pattern of these differences is linked with
body image factors.

1 This study was supported by a Public Health

It is with this problem of right vs. left re- 
activity that the present paper concerns it- 
self. Previous findings concerning this phe- 
nomenon have been as follows (Fisher, 1958; 
Fisher & Abercrombie, in press): 

1. Right-handed subjects (Ss) who, while 
wearing aniseikonic lenses, clearly distinguish 
the left body side as smaller than, or inferior 
to, the right side in their body image evalua- 
tions are significantly more likely than those 
not making this distinction to manifest a 
GSR gradient such that the left side is more 
reactive than the right side. 

2. Right-handed Ss who are secure about
their over-all body images (as measured by
reaction to tachistoscopically presented pictures
of mutilated bodies) show relatively
greater left than right GSR reactivity;
whereas Ss who are insecure about their body
images either show no gradient at all or relatively
more right than left reactivity. The
optimum GSR gradient associated with a
well integrated body image is the left directional
one.


3. Left-handed Ss fail to evidence signifi- 
cant relationships between body image meas- 
ures and the GSR gradient variable. 

A primary goal of the present project was
to determine how these findings would stand
up when approached in terms of new body
image measures derived from figure drawings.
Figure drawings were chosen for special
study in this respect because the Draw-A-Person
method has in the past been used
more than any other measure for evaluating
body image variables (Abel, 1953; Bender
& Keeler, 1952; Fisher & Cleveland, 1958;
Silverstein & Robinson, 1956; Witkin et al.,
1954).

The attainment of the first primary objec- 
tive resolved itself into two specific problems: 

1. To determine if GSR gradient patterns 
are linked with the right side vs. left side 
attributes of figure drawings. The specific 
prediction would be that those Ss who depict 
the left side of the figures as smaller than the 
right side would show relatively greater left 
GSR reactivity than right reactivity. Those 
drawing the right side of the body as rela- 
tively smaller or who depict no differences 
between the two body sides would fail to 
show the optimum left directional GSR 
gradient. 

2. To determine if GSR gradient patterns 
are related to indices of over-all body image 
integration derived from figure drawings. 
Here the specific prediction would be that 
Ss whose drawings evidenced good body im- 
age integration would be more likely to be 
typified by the norm left reactive GSR gradi- 
ent than Ss whose drawings evidenced poor 
body image integration. 

A second primary intent of the study was 
to test, by means of figure drawings, an hy- 
pothesis proposing that the existence of an 
optimum GSR gradient between the two body 
sides is at one level a function of whether the 
individual has learned a clear distinction
between male and female sex roles such that
the male is considered superior in size and 
strength. This hypothesis was the product of 
a rather speculative train of logic. It grew 
out of a proposition, which has been much 
elaborated elsewhere (Fisher & Cleveland, 
1958), that an individual’s body image atti- 
tudes do not reflect the actual appearance 
of his body but rather express or summarize 
patterns of important socialization experi- 
ences. This point was well exemplified in the 
finding that the definiteness which an indi- 
vidual ascribes to his body boundaries is not 
correlated with his actual physique as meas- 
ured by body type but is significantly linked 
with the definiteness of certain roles and 
values he learned from the parental figures 
(Fisher & Cleveland, 1958). Findings of 
this sort suggested that body image attitudes
were primarily a projection onto a figurative
“body screen” of important roles and concepts 
which had been learned. 

When, therefore, differences were found in 
the size and strength attributed by individu- 
als to their two body sides, and when these 
differences proved to be correlated with GSR 
gradients of reactivity, the question arose as 
to what underlying socialization variable or 
experience was being thus reflected. There 
were already observations in the literature 
(Fenichel, 1945; Wolff, 1943) which implied 
that attitudes toward the two body sides were 
influenced by conflicts about masculinity- 
femininity. It had been noted that in some 
dramatic instances schizophrenic individuals 
would act out sex role conflicts by developing 
the delusion that one side of their bodies was 
male and the other side female. Sufficient
leads of this kind could be found so that it 
seemed logical to relate differences in right- 
left body image and the correlated GSR re- 
activity differences to sex role variables. One 
could reason that the definiteness of an individual’s
right-left body image and GSR differences
was correlated with how well he had
learned a stable nonconflictual distinction be- 
tween the masculine and feminine roles. How- 
ever, it seemed doubtful that a distinction 
which attributed relatively greater power to 
the female than to the male and which there- 
fore was opposed to the usual cultural stereo- 
type of masculine-feminine size relationships 
could be maintained by the individual with- 
out a great deal of anxious confusion. So the 
final hypothesis which emerged proposed that 
a definite and optimum GSR reactivity gradi- 
ent between the two body sides was based on 
a significant body image distinction in the 
size attributed to the two body sides and that 
this distinction was in turn dependent upon 
the individual having learned to discriminate 
clearly between the male and female sex roles 
in such a fashion that the male was consid- 
ered to have greater size and power. 


Procedure 
Measure of Right-Left Body Reactivity 


The method for measuring relative right-left
GSR reactivity has already been described
in detail elsewhere (Fisher, 1958;
Fisher & Abercrombie, in press) and there- 
fore will be only briefly outlined. It is of the 
exosomatic type and involves simultaneous 
recordings from the left and right sides on 
separate channels against a relatively inac- 
tive upper arm. Resistance changes across 
the skin are picked up as changing potential 
and fed into a Grass EEG power amplifier. 
Recordings were taken simultaneously: first 
from the middle fingers; secondly from the 
forefingers; then from the fingers next to the 
little fingers; and finally from two corre- 
sponding points on the palms. The currents 
to each finger were carefully matched. Also, 
a common calibrating resistor in series with 
the upper arm checked the equivalence of 
the sensitivities of the two circuits. The area 
of recording from each site was controlled by 
masking off all but a fixed circular region by 
means of punched tape. For each S there was 
finally available side by side the continuous 
GSR responses for each set of right-left sites. 
The amplitudes of response at two homolo- 
gous sites at a given time could then be 
measured and compared. If the differences oc- 
curring at a given site were such that twice 
as many favored one side over the other and 
if at least eight were present, then it was 
judged that a directionality for that site had
been demonstrated. When an S showed this 
kind of directionality in favor of one side for 
three out of four sites, he was considered to 
have a definite GSR gradient in that direction. 
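The decision rule just described can be sketched in code. This is a minimal illustration, not the author's scoring procedure: it assumes that "at least eight" refers to the total number of right-left amplitude differences counted at a site, and that "twice as many" means one side is favored at least twice as often as the other. The function names and the example counts are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def site_direction(left_larger: int, right_larger: int) -> Optional[str]:
    """Classify one recording site from counts of amplitude differences.

    left_larger / right_larger: number of compared responses in which
    the left (or right) side showed the larger amplitude.
    """
    total = left_larger + right_larger
    if total < 8:  # assumed reading of "at least eight were present"
        return None
    if left_larger >= 2 * right_larger:
        return "left"
    if right_larger >= 2 * left_larger:
        return "right"
    return None


def gsr_gradient(sites: Sequence[Tuple[int, int]]) -> str:
    """Return 'left', 'right', or 'none' for one S across four sites.

    A definite gradient requires the same directionality at three of
    the four sites (middle fingers, forefingers, ring fingers, palms).
    """
    directions = [site_direction(left, right) for left, right in sites]
    for side in ("left", "right"):
        if directions.count(side) >= 3:
            return side
    return "none"


# Hypothetical counts (left_larger, right_larger) for the four sites.
example_sites = [(10, 3), (9, 4), (8, 2), (5, 5)]
example_gradient = gsr_gradient(example_sites)
print(example_gradient)
```

Keeping the site-level rule separate from the subject-level rule makes it easy to vary either threshold if a different reading of the criteria is preferred.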


Figure Drawing Measures 


Each S was asked to draw a full-length pic- 
ture of a person. When he had finished his 
first sketch, he was asked to draw a person of 
the opposite sex. The usual instructions were 
modified to the extent that Ss were told to 
draw only front views of each figure. It was 
necessary to do this in order to have both the 
right and left sides of each figure fully repre- 
sented. 

The following measures were derived from 
the drawings: 

A. Right-Left Size Differences. Size differ- 
ences between the right and left sides of each 
drawing were determined on the basis of the 
relative dimensions of the right and left arms 


and also of the right and left legs. Arm and 
leg differences were chosen not only because 
they are often the most prominent indicators 
of asymmetry in figure drawings but also be- 
cause other indices involving the trunk and 
head were highly unreliable due to the diffi- 
culty in establishing “middle” reference points 
from which to make right vs. left measures. 
The length of each of the figure’s arms was 
measured with a protractor from a point half 
way between the indicated boundaries of the 
uppermost part of the arm to a point at the 
tip of the longest finger. The length of each 
leg was measured from a point half way be- 
tween the boundaries of the leg at the point 
where it joined the middle of the body (crotch 
area) to a point indicated by the most distant
toe. This leg measure was made only on the 
drawing of the man because in the instance 
of the female drawing the dress almost al- 
ways concealed the legs. 

B. Over-all Body Image Disturbance. This 
measure was derived from one which was 
previously used by Witkin et al. (1954). It 
was developed by Machover in an attempt to 
predict on the basis of the individual’s body 
concept how well he would perform on tasks 
requiring spatial judgments in very unstruc- 
tured situations. Machover compiled an ex- 
tensive list of about 40 figure drawing signs 
which she considered on the basis of clinical 
experience to indicate lack of body confidence 
and general difficulty in developing an ac- 
ceptable body concept. Scores computed from 
these signs proved to differentiate signifi- 
cantly between Ss who presumably could use 
their bodies adequately as a frame of refer- 
ence for making spatial judgments and those 
who could not do so. In reviewing Machover’s 
list of signs, it became apparent that many 
of them involved making rather complex sub- 
jective judgments. It was therefore decided 
to use only 14 of the signs which were based 
on simple and relatively objective judgments. 
One penalty point, indicating disturbance in 
body concept, was assigned for the presence 
of each of the following signs: 


1. Erasures.

2. Transparency such that the figure defies the
laws of perspective as regards the masking of
objects when they are behind others.

3. Lack of any body part.

4. Nose indicated only by two nostril dots.

5. Mouth indicated only by a line.

6. One or more arms behind back.

7. Very crude or peculiar clothing.

8. Lack of breasts in the female figure.

9. Shading of the body. 

10. Lack of margins and delimiting lines in the 
figure (e.g., cuffs, collar, hemline). 

11. Figure markedly off balance 

12. Figure very small (less than one-half the length 
of the page). 

13. Markedly unusual shading or elaboration of 
the crotch area. 

14. Opposite sex drawn first. 


A total score was tabulated which equaled the 
sum of the penalty points assigned to both of 
the figures. It will be referred to as a “Body 
Disturbance” score. The scoring was done 
blindly with all subject identification removed.
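The Body Disturbance score described above is a simple tally, which the following sketch illustrates. The sign identifiers are hypothetical shorthand for the 14 listed signs, and the sketch assumes that each sign contributes at most one penalty point per figure.

```python
# Hypothetical shorthand identifiers for the 14 signs listed above.
BODY_DISTURBANCE_SIGNS = frozenset({
    "erasures",
    "transparency",
    "missing_body_part",
    "nose_as_nostril_dots",
    "mouth_as_line",
    "arm_behind_back",
    "crude_or_peculiar_clothing",
    "no_breasts_on_female_figure",
    "body_shading",
    "no_margins_or_delimiting_lines",
    "markedly_off_balance",
    "very_small_figure",
    "unusual_crotch_elaboration",
    "opposite_sex_drawn_first",
})


def body_disturbance_score(male_figure_signs, female_figure_signs):
    """Sum one penalty point per sign present, over both drawings."""
    male = set(male_figure_signs)
    female = set(female_figure_signs)
    unknown = (male | female) - BODY_DISTURBANCE_SIGNS
    if unknown:
        raise ValueError(f"unrecognized signs: {sorted(unknown)}")
    return len(male) + len(female)


# Hypothetical scoring of one S's two drawings.
example_score = body_disturbance_score(
    {"erasures", "body_shading"},
    {"erasures"},
)
print(example_score)
```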


C. Relative Size Attributed to Masculinity
vs. Femininity. The assumption was made
that the individual’s concept of the relative
size and power attributes of the male vs. female
roles would be indicated by the relative
heights of the male and female figures he
drew. This was purely an operational procedure.
Each drawing was evaluated by measuring
from the tip of the head to the furthest
distant point on one of the legs. Only
when the male figure was at least one-half
of an inch larger than the female figure was
the difference considered significant. This criterion
was adopted in order to minimize
chance differences in size which might result
from stray pencil stroke extensions.


Subjects


The Ss consisted of 34 men and 16 women.
Their median age was 21 and their median
educational level was 14 years. They were
mainly students who were recruited from college
organizations by paying each organization
a fee to have its total membership participate
in the experiment. Only right-handed
Ss were used because previous work (Fisher,
1958) had indicated that due to complicating
factors the relationships of body image and
GSR variables are masked in left-handed individuals.


Results 


Tabulation of the data indicated that 21 
Ss were left directional in their GSR gradi- 
ents; 9 were right directional; and 20 mani- 
fested no gradient at all. A chi-square analy- 
sis demonstrated only a chance relationship 
between directionality of GSR and figure 
drawing right-left side size differences. This 
held true for the figure drawing arm meas- 
ures considered separately and also for the 
leg measures. Figure drawing asymmetry 
seems to bear no relationship to GSR re- 
sponse asymmetry. 

However, a significant relationship in the 
predicted direction was found by means of 
chi square between the figure drawing Body 
Disturbance scores and GSR directionality. 
Ss scoring below the median in Body Dis- 
turbance significantly (.001 level) more often 
manifested left directional GSR gradients 
than Ss above the median in Body Disturb- 
ance. Those above the median more often 
showed either no GSR gradient at all or 
right directionality of response. 

Eighteen Ss drew the male figure as larger 
than the female figure. Thirty-two drew the 
female figure as equal in size or larger than
the male figure. There were no sex differ- 
ences among the Ss in this respect. Chi- 
square analysis of the relationship of the 
GSR gradient variable to the sex size difference
variable indicated a significant relationship
in the predicted direction. Those Ss who
manifested the left reactive GSR gradient 
drew the male figure as larger than the female
figure significantly (.05–.02 level)
more often than did Ss who had no GSR 
gradient or who were right reactive. 
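The half-inch criterion and the marginal totals reported above can be put together in a short sketch. The observed cell counts are not reported, so the chi-square value itself cannot be reproduced here; the sketch only classifies a pair of drawings by the stated criterion and computes the cell frequencies expected under independence from the reported marginals (21 left directional vs. 29 other Ss; 18 male-larger vs. 32 other Ss). Function names and the example heights are hypothetical.

```python
def male_drawn_larger(male_height_in, female_height_in, margin_in=0.5):
    """Apply the stated criterion: male figure at least 1/2 inch taller."""
    return male_height_in - female_height_in >= margin_in


def expected_counts(row_totals, col_totals):
    """Expected cell frequencies under independence for a two-way table."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must sum to the same N")
    return [[r * c / n for c in col_totals] for r in row_totals]


# Marginal totals taken from the Results section (N = 50).
gradient_totals = [21, 29]  # left directional; right directional or none
size_totals = [18, 32]      # male figure larger; equal or female larger

expected = expected_counts(gradient_totals, size_totals)
for label, row in zip(("left directional", "other"), expected):
    print(label, [round(cell, 2) for cell in row])

# Hypothetical heights in inches for one S's two drawings.
print(male_drawn_larger(6.0, 5.25))
```

A finding in the predicted direction corresponds to more left directional Ss in the male-larger cell than the first expected frequency.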


Discussion of Results 


A significant link between GSR direction- 
ality and asymmetry of the figure drawings 
was not demonstrated. This suggests that fig- 
ure drawings do not tap body image asym- 
metry attitudes in the same fashion as the 
aniseikonic lens technique (Fisher, 1958), 
which was first used to demonstrate a sig- 
nificant relationship between body image 
asymmetry and GSR right-left gradients. The 
aniseikonic lens measure required Ss to make
a series of size comparisons of their homologous
right and left fingers. This was done under
conditions in which their perceptual fields
were so distorted as to encourage projection 
of autistic attitudes. The aniseikonic lens 
method for evaluating body image asymmetry 
contrasts with the figure drawing method in 
two ways. First of all, it requires direct rather 
than indirect judgments about the individu- 
al’s own body. Secondly, it sets the specific 
task for the individual of making judgments 
regarding symmetry; whereas the figure draw- 
ing asymmetry measure is derived from re- 
sponses to a task having no specific symmetry 
connotations. This implies that if one wants 
to measure body image attitudes toward very 
specific areas or dimensions of the body, such 
a measure is most likely to be successful if it 
requires direct evaluation of these areas of 
the S’s own body. 

However, the results obtained relative to 
the figure drawing Body Disturbance score 
indicate that such specific evaluation by the 
individual of his own body is not necessary 
in order to measure more general and ab- 
stract body image dimensions. That is, a sig- 
nificant relationship was found between the 
Body Disturbance score and the GSR gradi- 
ent variable. Similarly, in a previous study 
(Fisher & Abercrombie, in press) it was 
possible to establish a significant relationship 
in the same direction between GSR gradient 
and an index of general body image disturb- 
ance based on reactions to tachistoscopically 
presented mutilated figures. This index, like 
the figure drawing score, was an indirect 
measure which in no way required the S to 
make judgments about his own body. 

The demonstration of a link between the 
Body Disturbance score and the GSR re- 
sponse pattern adds further validity to the 
proposition that body image attitudes influ- 
ence the body’s reactivity characteristics. But 
the most intriguing finding of the present 
study is that supporting the hypothesis con- 
cerning the relationship of GSR response pat- 
terning to definiteness of sex role distinction. 
Such a finding gives a bit of substance to the 
three-level general theory which has been 
proposed concerning role, body image, and 
physiological reactivity. There is now at least
one empirical result, as modest and tentative
as it may be, which indicates that a body re- 
activity pattern which has been shown to be 
related at one level to body image attitudes 
is at still another level related to the manner 
in which an individual conceptualizes an im- 
portant life role. The presumption is that the 
body image attitude represents a translation 
into body terms of the role concept. But this 
remains to be proven. A study is now under 
way which attempts such proof by tracing 
the relationships of GSR patterns, body im- 
age attitudes, and role attitudes over a wide 
age range from early childhood through adult- 
hood. 


Summary 


Previous work demonstrated relationships 
between body image measures and GSR right- 
left reactivity gradients. In the present study, 
body image measures derived from figure 
drawings failed to confirm a previous find- 
ing regarding the relationship of body image 
right-left size asymmetry and GSR direction- 
ality. However, a figure drawing measure of 
over-all body image disturbance was found to 
be linked in the predicted direction with GSR 
right-left responsivity. Further, a significant 
relationship was established between a figure 
drawing measure of the individual’s concept 
of the relative size and strength attributes of 
the male and female roles and GSR reactivity 
gradients. The findings lend support to a gen- 
eral theory that relates role, body image, and 
physiological reactivity.


POPULATION DIFFERENCES IN CONSTRUCT
VALIDITY


GERALD S. LESSER


Hunter College


In the seldom definitive efforts to validate 
measures of personality variables, it fre- 
quently happens that a test of “construct va- 
lidity” (American Psychological Ass., 1954) 
performed on a particular population will 
not be confirmed by an otherwise identi- 
cal validating test performed on a different 
population. This phenomenon has appeared 
in a variety of research areas, including at- 
tempts to establish the construct validity of 
projective measures predicting response to 
stress (Carlson & Lazarus, 1953; Williams, 
1947), projective signs of organic brain dam- 
age (Dorken & Kral, 1952; Fisher, Gonda, 
& Little, 1955), and measures of authoritarianism
and anxiety (Davids, 1955; French,
1955). In each pair of studies, identical measures
of personality variables were used, but
different populations were employed and dif- 
ferent results were obtained. 

Interpretations of failures to demonstrate 
consistent validity with diverse populations 
ordinarily focus upon the limitations of the 
measure being validated. However, there have 
been few attempts to assign psychological 
meaning to the differences between the popu- 
lations which produce the differential validity 
results. Why do certain populations yield 
positive validating findings, while other popu- 
lations, in response to identical measuring in- 
struments, yield inconclusive or negative re- 
sults? 

The answer proposed here is a special
application to validation studies of the well- 
accepted general proposition that the relation- 
ships among personality measures will de- 
pend upon the context of variables in which 
the personality measures are obtained. The 
present study demonstrates that (a) a single 
measure may display differential validity for
different populations, and (b) psychological
variables may be identified which will allow 
prediction of the nature of the differential 
validation results for different populations. 

The variable of anxiety was selected as one 
of the possible dimensions of population dif- 
ference affecting validation findings. Empiri- 
cal evidence (e.g., Grinker & Spiegel, 1945; 
Luria, 1942; Mahl, 1949) from diverse areas 
of psychological research links anxiety with 
disruption, disorganization, and erratic quality
of behavior. Kuhlen emphasizes this relationship
between anxiety and response variability,
concluding that “anxiety will show in
shifts of mood, instability, unpredictability
and inconsistency in performance”
(Kuhlen, 1952, p. 268). In addition,
many clinical discussions have centered upon
the influence of anxiety upon the validity of
diagnostic test results.

The present study does not, however, deal
directly with the effects of general, pervasive
anxiety but rather with anxiety specifically
related to the expression of aggression. The
assumption is made that this more circumscribed
anxiety will manifest certain consequences
which resemble the effects of general
anxiety, i.e., anxiety about aggression will be
associated with inconsistency in expressions
of aggression at different times and in response
to different measurement techniques.
Individuals and populations exhibiting high
anxiety over the expression of aggression
should manifest greater variability in aggressive
response and, hence, should exhibit lower
intercorrelations among measures of aggression.
The construct validity of measures of
aggression should be difficult to establish in
such populations.

Direct assessment of anxiety about aggression,
independent of aggressive need strength,
encounters formidable methodological difficul- 
ties. Indirect methods are suggested by the 
literature. For example, convincing evidence 
(Dollard, Doob, Miller, Mowrer, & Sears, 
1939; Hollenberg & Sperry, 1951; Miller, 
1948; Whiting & Child, 1953) exists for 
the relationship between inhibitory socializa- 
tion of aggression and the learning of anxiety 
about aggression. An indirect index of the 
strength of aggression anxiety in preadoles- 
cent boys was obtained by measuring the at- 
titudes and practices toward aggression so- 
cialization of their mothers. 


Hypothesis 


The population difference examined in this 
study represents a difference in the conditions 
of learning to express aggressive responses 
and the resulting difference in anxiety about 
aggression. 

The hypothesis tested is that, under con- 
ditions of high anxiety over the expression of 
aggression, the intercorrelations among vari- 
ous measures of aggression are significantly 
lower than under conditions of low anxiety 
over the expression of aggression. 


Method 
Subjects 


The Ss were 44 white boys (ages 10-0 to
13-2) and their mothers. The boys were
drawn from one fifth grade and two sixth
grades in two public schools. All of the boys
and their mothers in these three classes participated,
except one mother who refused to
be interviewed. The Kuhlmann-Anderson intelligence
quotients of the boys ranged from
82 to 119, with a mean of 102. The two
schools are in adjacent districts and the families
constitute a relatively homogeneous upper
lower-class group.


Maternal Attitudes and Practices 


Only one aspect of the environmental con- 
ditions of learning of aggressive behavior was 
measured, i.e., the maternal attitudes and 
practices supporting or prohibiting aggression. 
A structured questionnaire-interview sched- 
ule was orally administered to the mothers in 
their homes by a male interviewer. Questions
regarding the support or prohibition of aggression
constituted only one segment of the
total interview; the entire interview schedule 
is described in detail elsewhere (Lesser, 1952). 
Pertinent to the present study were 8 items 
concerning the mother’s attitudes toward ag- 
gression in children, and 13 items about the 
mother’s practices in dealing with the aggres- 
sive behavior of her child. An illustrative item 
measuring maternal attitudes toward aggres- 
sion is: “A child should be taught to stand 
up and fight for his rights in his contacts 
with other children.” The four response alternatives
of agree, mildly agree, mildly disagree,
and disagree were allowed for this item.
An example of an item measuring maternal 
practices concerning aggression is: “If your 
son comes to tell you that he is being picked 
on by a bully at the playground who is his 
own age and size, there would be a number 
of different things you might tell him. Would 
you tell him to ignore him and turn the other 
cheek?” Response alternatives for this item 
were Yes and No. Items that did not involve 
judgments on a four-point scale were trans- 
formed to have approximately the same range 
of scores as the items that involved four al- 
ternatives. 

A single score was obtained for each mother 
by combining all items, assigning plus scores 
to the responses indicating support of aggres- 
sion and minus scores to responses indicating 
discouragement of aggression. The range of 
scores was from +9 to -7, with a median 
score of +2. The corrected odd-even reli- 
ability coefficient was .80. 
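
The "corrected" odd-even coefficient is presumably the correlation between odd and even half-tests stepped up by the Spearman-Brown formula; the text does not name the correction, so that reading is an assumption. A minimal sketch of the correction and its inverse follows (the function names are illustrative, not from the source):

```python
def spearman_brown(half_r: float) -> float:
    """Step a half-test correlation up to an estimate of full-length reliability."""
    return 2.0 * half_r / (1.0 + half_r)


def half_r_from_corrected(full_r: float) -> float:
    """Invert the correction: the half-test correlation implied by a corrected value."""
    return full_r / (2.0 - full_r)


# Under the Spearman-Brown assumption, a corrected value of .80 implies
# an uncorrected odd-even correlation of about .67.
implied = half_r_from_corrected(0.80)
print(round(implied, 3))                  # 0.667
print(round(spearman_brown(implied), 2))  # 0.8
```

The same arithmetic would apply to the matched-half coefficients reported for the Picture-Frustration and fantasy measures, if they were corrected in the same way.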

The distribution of scores for maternal 
response to aggression was dichotomized to 
form one group of mothers (with scores above 
or at the median) whose attitudes and prac- 
tices were more supportive of aggressive be- 
havior than those of the other group (with 
scores below the median). The hypothesis de- 
mands that the intercorrelations among vari- 
ous measures of aggression for the children 
(N = 23) of the mothers who support ag- 
gression be significantly more positive than 
the intercorrelations for the children (N = 
21) of the mothers who oppose aggression. 
Boys whose mothers encourage aggression will 
be referred to as the Low Anxiety (LA) 
group; boys whose mothers discourage ag- 
gression will be called the High Anxiety 
(HA) group. 


Measures of Aggression 


Children’s Form of Rosenzweig Picture- 
Frustration Study. Measures of extrapunitive- 
ness (E), intropunitiveness (I), and impuni- 
tiveness (M) were obtained through stand- 
ard group administration and scoring of the 
Children’s Form of the Picture-Frustration 
Study (Rosenzweig, Fleming, & Rosenzweig, 
1948). 

The present sample of 44 boys differs sig- 
nificantly (p < .05) from Rosenzweig’s nor- 
mative group of 10- to 13-year-old boys, 
manifesting higher E scores (55% vs. 42%), 
lower I scores (21% vs. 28%), and lower M 
scores (24% vs. 30%). 

The corrected matched-half reliability co- 
efficients for the group of boys whose mothers 
encourage aggression (Low Anxiety group) 
were: E, .85; I, .58; M, .82. For the group 
of boys whose mothers discourage aggression 
(High Anxiety group), the corresponding re- 
liability coefficients were: E, .95; I, .73; M, 
.70. Differences in reliability coefficients for 
the two groups were not statistically signifi- 
cant. These reliability coefficients are some- 
what greater than those typically reported 
for the Picture-Frustration measures. This 
may be a function of the fact that the 
matched-half technique for computing reli- 
ability was used in the present study. 

Interjudge reliability of the scoring pro- 
cedure was evaluated by percentage of agree- 
ment between two judges: E, .86; I, .82; M, 
.84. 

Fantasy Aggression. Fantasy aggression in 
the children was measured through an adap- 
tation of the TAT procedure (Murray, 1943, 
pp. 3-5). A set of 10 pictures was designed. 
In each picture two boys are interacting. The 
pictures differed from one another in the de- 
gree to which the instigation to aggression 
was apparent. 

To insure complete and accurate transcrip- 
tion of the stories, tape recordings were taken. 
An introductory period preceding the fantasy 
task served both to establish rapport between 
the child and the male examiner, and to fa- 
miliarize the child with the recording device. 
Instructions were: 


I’m going to show you some pictures. These are 
pictures of two boys doing different things. What I’d 
like you to do is make up a story to each of these 
pictures. You can make up any story you wish; 
there are no right or wrong stories. Say what the 
boys are thinking and feeling and how the story will 
turn out. 


The 10 pictures, in the order of presenta- 
tion, were: 


1. One boy is holding a basketball and the other 
boy is approaching him with arms outstretched. 

2. One boy is stamping upon an ambiguous ob- 
ject and the other boy is reaching for the object. 

3. One boy is sitting behind the other boy in a 
classroom and is leaning toward him. 

4. One boy is walking down the street, and the 
other boy, with fists clenched, is glaring at him. 

5. One boy, with fists clenched, is staring at the 
other boy who is sitting, head bowed, on a box. 

6. One boy is sawing a piece of wood, and the 
other boy is leaning on a fence between them, talk- 
ing to him. 

7. The two boys, surrounded by a group of other 
boys, are approaching each other with arms upraised 
and fists clenched. 

8. The two boys are making a fire. One boy is 
kneeling to arrange the wood and the other boy is 
approaching, laden with wood for the fire. 

9. One boy, who is looking back, is running down 
a street, and the other boy is running behind him. 

10. Two boys are standing in a field. One boy, 
with his hand on the other boy’s shoulder, is point- 
ing off in the distance. 


A fantasy aggression score was obtained for 
each S by counting the number of times the 
following acts appeared in his stories: fight- 
ing, injuring, killing, attacking, assaulting, tor- 
turing, bullying, getting angry, hating, break- 
ing, smashing, burning, destroying, scorning, 
expressing contempt, expressing disdain, curs- 
ing, swearing, threatening, insulting, belit- 
tling, repudiating, ridiculing. 

Fantasy aggression scores ranged from 1 
to 15, with a mean of 5.3. The corrected 
matched-half reliability coefficient for the 
group of boys whose mothers encourage ag- 
gression (Low Anxiety group) was .88. For 
the group of boys whose mothers discourage 
aggression (High Anxiety group), the reli- 
ability coefficient was .85. 

The interjudge scoring reliability coefficient was 
.92. 

Overt Aggression. To measure overt aggres- 
sion in the child, a modified sociometric de- 
vice, the “Guess Who” technique, was adopted. 
The Ss were presented with a booklet con- 
taining a series of written descriptions of chil- 
dren, and asked to identify each of these 
descriptive characterizations by naming one 
or more classmates. Fifteen overt aggression 
items were used, such as “Here is someone 
who is always looking for a fight.” A diversity 
of aggressive behaviors was included; items 
depicted verbal, unprovoked physical, pro- 
voked physical, outburst, and indirect forms 
of aggressive behavior. 

An overt aggression score was obtained for 
each S by counting the number of times he 
was named by his classmates. There were sub- 
stantial differences among the three classes 
in the distributions of the overt aggression 
scores; in order to combine into one distribu- 
tion the scores of children in different classes, 
overt aggression raw scores were transformed 
into standard scores. 
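
A minimal sketch of the within-class standardization described above. The class labels and nomination counts are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form was used:

```python
from statistics import mean, pstdev


def standardize_within_class(raw_by_class):
    """Convert raw nomination counts to standard (z) scores separately per class."""
    z_by_class = {}
    for class_name, scores in raw_by_class.items():
        m = mean(scores)
        sd = pstdev(scores)  # assumed: population SD; a class with no spread would fail here
        z_by_class[class_name] = [(x - m) / sd for x in scores]
    return z_by_class


# Hypothetical raw counts: the second class names aggressors far more freely.
raw = {"fifth": [0, 2, 4], "sixth_a": [10, 20, 30]}
z = standardize_within_class(raw)
print([round(v, 3) for v in z["fifth"]])
print([round(v, 3) for v in z["sixth_a"]])
```

After the transformation both hypothetical classes have the same z scores, so a boy's standing reflects his position within his own class rather than how many nominations that class produced.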

The biserial correlation coefficient between 
the overt aggression measure derived from 
the children and teacher entries for the same 
“Guess Who” aggression items was .76 (p < 
.01). 


Results 


The intercorrelations among measures of 
extrapunitiveness, intropunitiveness, and im- 
punitiveness (derived from the Picture-Frus- 
tration Study), fantasy aggression (derived 
from a modified TAT procedure), and overt 
aggression (measured by a near-sociometric 
device) are presented in Table 1. The inter- 
correlations are presented separately for the 
Low Anxiety group, the High Anxiety group 
and both groups combined. 

All correlations are Pearson product-mo- 
ment coefficients. Over-all correlations were 
obtained by pooling within-group sums of 
cross products and sums of squares; this pro- 
cedure is described by, for example, Mc- 
Nemar (1955). All probability values are 
based upon two-tailed tests of significance. 
Scatter plots revealed some skewness in a 
number of distributions; hence, all distribu- 
tions were normalized for calculation of Pear- 
son product-moment correlations. 
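
The pooling procedure can be sketched as follows: deviations are taken about each group's own means, the resulting sums of squares and cross products are added across groups, and a single coefficient is formed from the totals. The data below are hypothetical and the function names are illustrative.

```python
from math import sqrt


def within_group_sums(x, y):
    """Sums of squares and cross products about this group's own means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy


def pooled_within_r(groups):
    """groups: list of (x, y) pairs, one pair of equal-length sequences per group."""
    sxx = syy = sxy = 0.0
    for x, y in groups:
        gxx, gyy, gxy = within_group_sums(x, y)
        sxx += gxx
        syy += gyy
        sxy += gxy
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for two groups whose means differ widely.
low_anxiety = ([1, 2, 3], [1, 2, 3])
high_anxiety = ([101, 102, 103], [51, 52, 53])
print(pooled_within_r([low_anxiety, high_anxiety]))
```

Because each group's deviations are taken about its own means, a difference in level between the groups does not inflate the combined coefficient.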

This study proposes that each measure of 
aggression employed constitutes a validating 
criterion for every other measure. For all 
seven correlations among the Picture-Frustra- 
tion, TAT, and near-sociometric measures, the 
coefficients for the Low Anxiety group are 
greater (in a direction indicating greater va- 
lidity of the measures) than for the High 
Anxiety group. In four of the seven compari- 
sons, the difference between Low Anxiety and 
High Anxiety groups is significant beyond the 
.05 level of confidence. 


Table 1 
Intercorrelations Among Extrapunitiveness (E), Intropunitiveness 
(I), Impunitiveness (M), Fantasy Aggression (FA), and Overt 
Aggression (OA) for Low Anxiety Group, High Anxiety Group, 
and Both Groups Combined 

[Table entries are not legible in the source.] 

Note. In each cell of three coefficients, the first 
coefficient represents the correlation for the Low Anxiety 
group (N = 23), the second coefficient represents the 
correlation for the High Anxiety group (N = 21), and the 
third represents the combined correlation. The significance 
markers (two-tailed tests) and the marker for cells in which 
the Low Anxiety group differs significantly from the High 
Anxiety group are not legible. 

The extent of the difference between Low 
Anxiety and High Anxiety groups is most ap- 
parent for the relationships of E, I, and M 
with overt aggression and for the relationship 
of fantasy aggression with overt aggression. 
The direction of difference in favor of the 
Low Anxiety group is maintained for the re- 
lationships of E, I, and M with fantasy ag- 
gression, but to a lesser degree. Thus, differ- 
ences in construct validity which are depend- 
ent upon the population difference in degree 
of anxiety about aggression are clearly dis- 
cernible for the relationships between non- 
overt (Picture-Frustration and TAT) and 
overt (near-sociometric) measures of aggres- 
sion and are less discernible for relationships 
among the nonovert measures (Picture-Frus- 
tration and TAT) of aggression. 
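
The report does not say which test was used to compare the Low Anxiety and High Anxiety coefficients. One conventional choice for two independent Pearson correlations is Fisher's r-to-z transformation; the sketch below assumes that method, and the correlation values in the usage lines are hypothetical. Only the group sizes (23 and 21) come from the text.

```python
from math import atanh, erfc, sqrt


def compare_independent_r(r1, n1, r2, n2):
    """Two-tailed z test for the difference between two independent Pearson r values."""
    z1, z2 = atanh(r1), atanh(r2)               # Fisher r-to-z transformation
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p_two_tailed = erfc(abs(z) / sqrt(2.0))     # equals 2 * (1 - Phi(|z|))
    return z, p_two_tailed


# Hypothetical coefficients for LA (N = 23) and HA (N = 21).
z, p = compare_independent_r(0.60, 23, -0.10, 21)
print(round(z, 2), round(p, 3))
```

With groups this small the standard error is large, so only sizable differences between coefficients reach significance.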

The intercorrelations among E, I, and M 
measures, which appear in Table 1, do not 
possess the same status as independent com- 
parisons that the intercorrelations of E, I, 
and M with the measures of overt and fantasy 
aggression possess. The closed 
nature of the scoring system for the Picture- 
Frustration responses automatically demands 
negative correlations among E, I, and M, 
since these three measures have only 2 df. 
That is, a high E score will automatically be 
accompanied by a low I score, a low M score, 
or both. Positive validity results require nega- 
tive correlations between E and I and be- 
tween E and M. Such negative correlations 
are demanded equally for both the Low Anx- 
iety and High Anxiety groups. However, the 
negative correlations between E and I and 
between E and M in Table 1 are greater for 
the Low Anxiety group than for the High 
Anxiety group. 
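
The constraint imposed by the closed scoring system can be checked numerically. If E, I, and M are percentages that sum to 100 for every S, then cov(E, I) + cov(E, M) = -var(E), so E must covary negatively with at least one of the other two scores. The profiles below are hypothetical.

```python
def covariance(x, y):
    """Population covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n


# Hypothetical percentage profiles; M is fixed once E and I are known (2 df).
E = [55, 40, 62, 35, 48]
I = [20, 30, 18, 25, 30]
M = [100 - e - i for e, i in zip(E, I)]

lhs = covariance(E, I) + covariance(E, M)
rhs = -covariance(E, E)
print(abs(lhs - rhs) < 1e-9)  # True
```

The identity holds for any data of this closed form, which is why only the size of the negative E-I and E-M correlations, not their sign, can differ between groups.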


Discussion 


When psychological studies produce dis- 
crepant findings, it has become routine to 
state that population differences may have 
produced the discrepancies. Differences in va- 
lidity findings in studies employing identical 
measures are often vaguely explained in this 
manner. However, few attempts have been 
made to systematically define the psychologi- 
cal variables contributed by population dif- 
ferences and to empirically test the effects of 
these differences upon validity. In this study, 
population differences in degree of anxiety 
about aggression were shown to affect the 
construct validity results for a number of 
measures of aggression. 

The general form of the conclusion drawn 
from this result is that there may be inter- 
active effects between measures of personality 
variables and measures of psychological char- 
acteristics of the populations upon which the 
measures are obtained which influence other 
measures of the personality variable. Validity 
findings will be obscured by failure to con- 
sider such interactions. 

The results of this study indicate that the 
validity of nonovert and overt measures of ag- 
gression can be more easily demonstrated for 
a group of boys who have low anxiety about 
aggression. The population difference in anx- 
iety about aggression produces greater vari- 
ability or inconsistency between nonovert and 
overt measures of aggression than among non- 
overt measures of aggression. 

Certain limitations of the present study are 
apparent. The population difference in anx- 
iety about aggression was measured indirectly 
by means of information concerning the in- 
hibitory socialization attitudes and practices 
indicated by the mothers of the Ss. Although 
there is substantial evidence in other re- 
searches that anxiety about aggression is pro- 
duced by inhibitory socialization of aggres- 
sion, measures of the Ss’ own responses which 
indicate anxiety about aggression would pro- 
vide a more direct test of the hypothesis. In 
addition, the fact that the total group of boys 
employed in this study was significantly more 
extrapunitive and significantly less intropuni- 
tive and less impunitive than Rosenzweig’s 
normative group makes the representativeness 
of the present sample suspect. 

One other potential qualification was em- 
pirically assessed. An alternative explanation 
of the present results may be based upon dif- 
ferential attenuation in range of scores on the 
aggression measures for the Low Anxiety and 
High Anxiety groups. However, differences in 
standard deviations were not statistically sig- 
nificant for any aggression measure. It does 
not seem likely that the consistently lower 
correlations for the High Anxiety group were 
based upon greater attenuation in range of 
aggression scores for this group. 


Summary 


The results of this study suggest that it is 
inaccurate to describe measures of person- 
ality as valid or invalid in a general or over- 
all sense. A single personality measure may 
possess different degrees of validity for dif- 
ferent populations. 

The hypothesis tested is that under condi- 
tions of high anxiety about aggression, the 
intercorrelations among various measures of 
aggression are significantly lower than under 
conditions of low anxiety about aggression. 

Anxiety about aggression was measured by 
assessing the inhibitory socialization behavior 
of the mothers of 10- to 13-year-old boys. 
Scores obtained from three techniques for 
measuring aggression in the boys (Rosen- 
zweig’s Picture-Frustration Study, a modified 
TAT, and a near-sociometric measure of overt 
aggression) were intercorrelated. 

All intercorrelations among the measures of 
aggression were greater (in the direction in- 
dicating greater validity of the measures) for 
a group of boys with low anxiety about ag- 
gression than for a group of boys with high 
anxiety about aggression; for four of the 
seven intercorrelations, this difference was sta- 
tistically significant. Also, correlations among 
the Picture-Frustration scores themselves in- 
dicated greater validity of these measures for 
the Low Anxiety group of boys than for the 
High Anxiety group. 

The study demonstrates that, in validation 
research, more exact specification is required 
in the determination of conditions under 
which consistency among various aspects of 
an S’s responding may be expected and the 
conditions under which no consistency or in- 
consistency may be anticipated. 


REINFORCEMENT IN THE RORSCHACH 


LEONARD R. GROSS¹ 


Veterans Administration Hospital, Roanoke, Virginia 


While there is some awareness of the dy- 
namic interactions between examiner (E) and 
subject (S), the problem of just how and to 
what extent test results are influenced by 
these interactions has not been fully ex- 
plored. It has been demonstrated that Ror- 
schach responses can be influenced by ori- 
entational sets stemming from pretest prac- 
tice (Leventhal, 1956), pretest suggestion 
(Abramson, 1951), and conscious instruc- 
tions (Henry & Rotter, 1956), as well as by 
E differences (Sanders & Cleveland, 1953), 
but there are also reasons to believe that the 
E’s actions throughout the testing situation 
may affect Rorschach performance (Wickes, 
1956). 

It would be desirable to show that differ- 
ent cues given by the E are responded to 
without conscious awareness by the S, re- 
sulting in a changed Rorschach protocol. This 
study attempted to reinforce general human 
content on the Rorschach, using the verbal 
reinforcer good and the nonverbal reinforcer 
nodding. The Ss were randomly selected psy- 
chiatric patients. 

On the basis of previously mentioned stud- 
ies it can be hypothesized that: (a) the ver- 
bal reinforcer good will increase the fre- 
quency of the reinforced responses over that 
of a control group; (b) the nonverbal rein- 
forcer nodding will increase frequency of the 
reinforced responses over that of a control 
group; and (c) the verbal stimulus will be 
more effective than the nonverbal stimulus in 
increasing the reinforced responses. 


1 The author wishes to acknowledge his indebted- 
ness to W. J. Eichman and B. M. Smith of the 
Roanoke, Virginia, Veterans Administration Hospital 
for their guidance and assistance. 


Method 


The Ss were selected from the psychiatric 
section of a university hospital on the follow- 
ing bases: (a) no history of organic brain 
damage, (b) a minimum of tenth grade edu- 
cation, and (c) no previous Rorschach ex- 
perience. They were randomly selected from 
both the inpatient and outpatient services 
and placed in one of three groups in the fol- 
lowing prescribed order: verbal reinforcement 
group (VR), nonverbal reinforcement group 
(NVR), or control group (C). 

The Ss were excluded from the study if 
they did not meet both of the following cri- 
teria: (a) give at least one response involv- 
ing general human content (humans, human- 
like creatures, human anatomy) during the 
first two cards, and (b) give three responses 
per card for all ten cards. 

Out of the 46 Ss tested, 6 did not meet the 
criterion of three responses per card, and 10 
failed to produce one human response within 
the first two cards, leaving a total of 30 Ss, 
with 10 Ss in each group. The mean age of 
all the Ss was 34, with a range of 17 to 53. 
The sexes were evenly divided. The diagnoses 
were mixed and included neurotics, character 
disorders, and psychotics. The mean level of 
education for the 30 Ss was the twelfth grade, 
with 11 Ss having some college education. 
There were no significant differences in edu- 
cational level among the three groups. 

The Ss were presented with the complete 
set of Rorschach cards in the standard pro- 
cedure for the free association with one varia- 
tion in the Beck instructions, i.e., the inclu- 
sion of a sentence requesting three responses 
per card. 

In the VR group, the E said “good” after 
each general human response. In the NVR 
group, the E nodded his head once after each 
human response. In the C group, the cards 
were administered with the attempt not to 
offer any cues. 

Posttest interviews revealed that none of 
the Ss verbalized any awareness of the na- 
ture of the study. 


Results 


The mean numbers of general human re- 
sponses were compared for the three groups. 
As a result of heterogeneity of variance, a 
square root transformation was applied to the 
raw scores of the individual Ss. The means 
and variances of the transformed scores as 
well as the raw scores are found in Table 1. 
The analysis of variance of the transformed 
scores for the effect of reinforcement yielded 
an F significant at the .06 level. The groups 
were compared with each other by means of 
individual t tests. A one-tailed t test was used 
because the direction was previously specified. The 
VR group gave more human responses than 
the C group at the .05 level of significance. 
The NVR group yielded more human re- 
sponses than the C group at the .02 level of 
significance. The VR and NVR groups did 
not differ significantly from each other. 
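
The two analytic steps described above, a square root transformation to stabilize the variances followed by a one-way analysis of variance, can be sketched as follows. The counts are hypothetical and were chosen only to keep the arithmetic easy to follow; with 3 groups of 10 Ss, as in the study, the degrees of freedom would be 2 and 27.

```python
from math import sqrt


def sqrt_transform(groups):
    """Apply a square root transformation to every raw count."""
    return [[sqrt(x) for x in g] for g in groups]


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical human-response counts for three small groups.
raw = [[1, 4, 9], [4, 9, 16], [9, 16, 25]]
print(one_way_f(sqrt_transform(raw)))  # (3.0, 2, 6)
```

In the hypothetical raw counts the spread grows with the mean; after the transformation every group has the same variance, which is the condition the analysis of variance assumes.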


Discussion 


The results suggest that nodding or saying 
“good” will increase the frequency of pre- 
selected content responses in a Rorschach 
situation. 

The first two hypotheses, that the verbal 
reinforcer good and the nonverbal reinforcer 
nodding will increase the frequency of the 
reinforced responses, appear to be substanti- 
ated. The third hypothesis, that the verbal 
stimulus will be more effective than the non- 
verbal stimulus in increasing the reinforced 
response, is rejected. The latter result is in 
conflict with previous studies using nonverbal 
cues (Taffel, 1955). A possible explanation is 
that the use of a flashing light as a nonverbal 
reinforcer in other studies was not perceived 
as part of the testing situation, while nodding 
was. 

The most obvious implication of the results 
is that one cannot discount even minimal or 
unconscious cues of the E when analyzing a 
Rorschach protocol. It follows that interpreta- 
tions of test responses and test behavior 
should not be considered separately but in 
light of the total situation. Both the E’s be- 
havior and the S’s conception of the testing 
situation are of import. 


Table 1 

Raw and Transformed Scores of the Three Groups 

Group    Raw Scores        Square Root Transformation 
         Mean    Var       Mean    Var 

[Cell values for the VR, NVR, and C rows are largely illegible 
in the source; the surviving fragments are 3.09 (NVR row), 
2.39 and 33, and 2.86.] 


While the results in this study are sugges- 
tive of some of the variables involved in the 
complicated interaction between tester and 
testee or interviewer and interviewee, it re- 
mains for future research to make experi- 
mentally clear other variables operating in 
such situations. It is probable that other as- 
pects of Rorschach responses can be rein- 
forced in a similar manner, but exactly which 
responses occur in sufficient number to be re- 
inforced as well as how much of a variation 
in E behavior is necessary to produce a 
change in a response level has not been an- 
swered. Another question is just how impor- 
tant are the variables that can be affected. It 
seems that if classical scoring methods are 
used these variables can have considerable 
effect on the dynamic picture. 


Summary 


The main interest was interpersonal rela- 
tions in a clinical situation. The study was 
designed to test the general hypothesis that 
E-S interaction is an important variable in 
test results. Thirty psychiatric patients were 
randomly selected and administered the free 
association of the Rorschach with the stand- 
ard instructions modified so as to elicit three 
responses per card. The Ss were then pre- 
sented with either verbal reinforcement good, 
nonverbal reinforcement nodding, or no rein- 
forcement, whenever they gave a general hu- 
man content response. It was found that both 
the VR and NVR groups gave significantly 
more of the reinforced responses than the C 
group. There were no significant differences 
between the two types of reinforcement. The 
findings suggest that cues given by the E can 
affect response categories. The necessity for 
avoiding interpretations of test protocols in 
vacuo was discussed. 


A NOTE ON WITTENBORN’S FACTOR ANALYSIS 
OF RORSCHACH SCORING CATEGORIES¹ 


MITCHELL GLICKSTEIN² 


Department of Psychiatry, University of Illinois College of Medicine 


In 1950, Wittenborn (1950a, 1950b) pub- 
lished two articles in which he presented the 
results of two factor analytic studies of Ror- 
schach scoring categories. In these reports, he 
discussed the correlation coefficients between 
scoring categories only in terms of the lim- 
ited range of the variables involved, and the 
positive skewness of these variables. How- 
ever, there are other statistical difficulties 
which tend to invalidate many of his conclu- 
sions. In both of Wittenborn’s studies, there 
is no attempt to deal with the statistical re- 
lationships which are imposed by the testing 
procedure itself. The present note is con- 
cerned with two of these: the effect on cor- 
relation coefficients of a variable number of 
responses (R), and the multiple scoring of 
single responses in different groups of scoring 
categories, within each of which the scores 
are mutually exclusive. Of the two, the effect 
of a variable number of R is by far the more 
serious. 

On the basis of the data which Wittenborn 
presented, it is possible to illustrate the ex- 
tent to which these considerations influence 
the results. Wittenborn stated that different 
Rs might lead to a “general or quasi general 
factor of productivity” (1950a, p. 262). But 
this assumes that all of the correlation at- 
tributable to R would be confined to one ro- 
tated factor. Even if R influenced only the 
first centroid, his rotational procedure would 
cause it to affect all of the factors to a greater 
or lesser degree. This situation, in which the 
correlations between categories are inflated by 
the joint effect of R, is one example of spu- 
rious correlation. Several authors have dis- 
cussed this problem in detail (Simon, 1954; 
Yule & Kendall, 1937; Zeisel, 1947). Most 
generally, it can be described as the inflation 
of a correlation between two variables which 
can be attributed to their mutual dependence 
on a third variable. 


1 This work was supported by Grant M-1370, U. S. 
Public Health Service. 

2 Now at California Institute of Technology, Divi- 
sion of Biology. 

In order to check on the effect of a variable 
number of responses on the obtained matrices, 
partial correlations were computed, holding 
R constant. This was done entirely from the 
data furnished by Wittenborn in his two 
studies. Tables 1-4 present the intercorrela- 
tion matrices before and after partialling out 
R. Tables 1 and 3 are the same matrices that 
Wittenborn published, with the exception that 
the intercorrelations with R are not included. 
Tables 2 and 4 are these same matrices, with 
R partialled out. 
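
Holding R constant requires only the three zero-order correlations among the two categories and R, which is why the partials could be computed from the published matrices alone. A minimal sketch of the standard first-order partial correlation formula follows; the input values are hypothetical and do not come from Wittenborn's tables.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Hypothetical case: two categories correlate .40 with each other
# and .60 each with the number of responses, R.
print(round(partial_r(0.40, 0.60, 0.60), 4))  # 0.0625
```

In this hypothetical case nearly all of the apparent relationship between the two categories is carried by their shared dependence on R.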

Figures 1 and 2 present these data in 
graphical form. The dashed lines are fre- 
quency distributions of the simple correlation 
coefficients from each of Wittenborn’s two 
published matrices, exclusive of the correla- 
tions with R. The solid lines are distributions 
of the same coefficients, with R held constant. 
Note that in both cases the mean of the dis- 
tribution of partial correlations closely ap- 
proximates zero. 

[Fig. 1. Frequency distribution of correlations before and after 
partialling out number of responses. Data from ...] 

The matrices of partial correlation (Tables 
2 and 4) reveal another artifact imposed by 
the test procedure, which is reflected in the 
correlation coefficients, and ultimately in the 
factor structure. The partial intercorrelations 
of any small number of mutually exclusive 
categories are forced to tend toward nega- 
tive values. The tendency would be most ob- 
vious in the case of two variables where, for 
example, the partial correlation between true 
items and false items would have to be 
-1.00. This effect becomes less important 
the greater the number of mutually exclusive 
categories intercorrelated, which can be seen 
in the present matrices of partial intercorrela- 
tion (Tables 2 and 4). The pooled partial cor- 
relations between the five location categories 
in the two matrices have a median value of 
-.195, with 15 of these 20 values negative. 
A sign test shows this negative tendency to 
be significant beyond the .025 level. The 
Rorschach determinant scores, in which the 
categories are mutually exclusive but more 
numerous, have a median intercorrelation of 
-.001, and these partials are not signifi- 
cantly negative. This problem of deflation of 
correlations by mutually exclusive categories 
is, of course, not resolved by the procedure 
of partial correlation. 
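
The sign test on the location categories can be reproduced from the two counts given in the text (15 negative values out of 20). The sketch below computes the exact binomial tail; reading the reported level as one-tailed is an inference from the figures, not something the note states.

```python
from math import comb


def sign_test_one_tailed(n_negative, n_total):
    """P(at least n_negative negative signs out of n_total) when each sign has p = 1/2."""
    tail = sum(comb(n_total, k) for k in range(n_negative, n_total + 1))
    return tail / 2 ** n_total


print(round(sign_test_one_tailed(15, 20), 4))  # 0.0207
```

The one-tailed probability falls below .025, consistent with the reported level; a two-tailed probability would be twice as large.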

These considerations severely limit any 
confidence in the resultant factor structure. 
The general inflation of the entire matrix by 
a variable R would be reflected in spuriously 
high communalities. The mutually exclusive 
categories would artificially decrease the cor- 
relations among some variables and not 
others. 


Journal of Consulting Psychology 
Vol. 23, No. 1 


SOME COMMENTS ON CONFOUNDED CORRELATIONS 
AMONG RORSCHACH SCORES 


J. R. WITTENBORN 


Rutgers University 


Glickstein has indicated that he believes 
that R, the total productivity score on the 
Rorschach, influences the correlation between 
other Rorschach scores in the manner of a 
common third variable (Glickstein, 1959). 
He then proceeds to examine the way in 
which this putative common third variable af- 
fects intercorrelations among Rorschach scor- 
ing categories; he does this by holding the 
variance of R constant through the device of 
partial correlation. He finds that a substan- 
tial portion of the resulting partial correla- 
tions carries a negative sign. He then ascribes 
the preponderance of negative partial correla- 
tions to the fact that Rorschach scoring pro- 
cedures involve the use of a “small” number 
of mutually exclusive categories. After assum- 
ing that R inflates all the intercorrelations in 
the manner of a common third variable and 
after further assuming that the intercorrela- 
tions among the scoring categories are simul- 
taneously suppressed by the mutually exclu- 
sive qualities inherent in the Rorschach scor- 
ing procedure, he concludes that confidence 
in the result of factor analytic studies of Ror- 
schach scoring categories must be severely 
restricted. 

The writer’s views concerning the nature 
and significance of the R variable are dif- 
ferent from Glickstein’s. The number of re- 
sponses in the various scoring categories de- 
termines R, and R does not delimit the num- 
ber of responses in any scoring category; 
accordingly, the writer does not consider it 
useful to regard R as a common third vari- 
able. R is generated by the number of re- 
sponses in the various scoring categories, and 
the writer shares the interpretation of Coan 
(1956) and of Stotsky (1957) who see the 
relationship between any response category 
score and the total productivity score, R, as 
a part-whole relationship. With this inter- 
pretation, partialling R from the correlations 
between Rorschach scoring categories would 
be something like partialling the total score 
on a test from the correlations between the 
subtest scores or like partialling the full scale 
score on the WAIS from the intercorrelations 
among the various subtests. The writer does 
not have any questions about the Rorschach 
or any other test which require this kind of 
statistical exercise, but he suspects that such 
a partial correlation procedure would yield 
many negative partial intercorrelations among 
the WAIS subtests, just as it was found to 
yield negative partial intercorrelations among 
Rorschach categories. 
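
The part-whole argument can be illustrated with an idealized case that is not in the source: k parts that are mutually uncorrelated and of equal variance, with the total defined as their sum. Each part then correlates 1/sqrt(k) with the total, and partialling out the total gives a partial correlation of -1/(k - 1) between any two parts.

```python
def partial_r_of_parts_given_total(k: int) -> float:
    """Partial r between two of k uncorrelated, equal-variance parts, total held constant.

    Assumes k >= 2; the idealized parts are an illustration, not data from either study.
    """
    r_parts = 0.0                         # the parts are uncorrelated by assumption
    r_part_total = (1.0 / k) ** 0.5       # correlation of each part with the sum
    return (r_parts - r_part_total ** 2) / (1.0 - r_part_total ** 2)


for k in (2, 5, 11):
    print(k, round(partial_r_of_parts_given_total(k), 3))
```

Under these assumptions the partials are negative even though the parts are unrelated, and they shrink toward zero as the number of parts grows.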

The effect of the mutually exclusive fea- 
ture of the Rorschach scoring procedure need 
not be as great as Glickstein imagines. If the 
correlations were based on a sample of per- 
sons who gave but one response each, there 
would have to be an inverse relationship be- 
tween any two scoring categories for that re- 
sponse, and as Glickstein notes, if only two 
scoring categories were available (instead of 
the numerous determinant and location cate- 
gories that are available), there would also 
have to be a negative relationship between 
the two scoring categories. Because of the di- 
versity of available determinant and location 
categories and because most subjects give nu- 
merous responses, it is not necessary for the 
intercorrelations between the categories to be 
small or negative, even in samples where R is 
held constant, e.g., in hypothetical samples 
comprising persons giving the same number 
of responses. Since the variance of R is gen- 
erated by the variance of the scoring cate- 
gories, it is not surprising that the correla- 
tions between the scoring categories are greatly 
reduced when the variance of R is eliminated 
from the correlations by the partial correlation
procedure. In view of this, it would be
difficult to ascribe any general significance to
the fact that many of Glickstein’s partial correlations
are small and negative.

The writer’s interest in the Rorschach 
method was generated by clinical use, and 
during the 1940’s several studies were under- 
taken for the purpose of exploring some of 
the assumptions which were explicit in the 
method or were inferable from standard pro- 
cedures (Wittenborn & Sarason, 1949; Wit- 
tenborn: 1949a, 1949b, 1949c, 1950a, 1950b, 
1950c; Wittenborn & Holzberg, 1951). At 
that time it was recognized that the nature 
and the distribution of some of the variables 
do not always meet the exact requirements 
of some of the statistical procedures em- 
ployed, e.g., “Since the present investigation 
is concerned with the conceptual implications 
of the Rorschach scoring procedures, it was 
decided to study the scores directly with 
whatever imperfections they may possess and 
not attempt statistically to refine them or 
their intercorrelations” (Wittenborn, 1950a, 
pp. 263-264). 

When the study from which Glickstein 
draws his data was planned, it was hypothe- 
sized that the scores would be intercorrelated 
in such a way as to generate a general factor 
of productivity. Total number of responses, 
R, was left in the matrix in order to identify 
a factor of productivity and indirectly to fa- 
cilitate the identification of other factors. The 
correlations did not sustain the hypothesis of 
such a general factor. Nevertheless, a group 
factor of productivity was found, and a simi- 
lar factor has been described by other in- 
vestigators using other samples (Coan, 1956; 
Williams & Lawrence: 1953, 1954). The fact 
that R, the total productivity score, bears 
a part-whole relationship with various Ror- 
schach scores has been recognized by most 
investigators, and some of them have elimi- 
nated this distorting effect of R on factor 
patterns by eliminating R from the total 
matrix of correlations (Coan, 1956; Stotsky, 
1957). Both Coan and Stotsky explicitly recognize
that the nature of the Rorschach procedure
per se generates varieties of confounding
among the Rorschach variables, and they
have been resourceful in attempting to control
these effects in their analyses.

To describe some of the intercorrelations 
between Rorschach variables as spurious is 
correct from the standpoint of the statistical
denotation of the word spurious, but so nam- 
ing this effect does not reduce it or make the 
various sources of confounding any less real 
or make R a less useful part of the Rorschach 
method of personality analysis. At present, 
one has the unhappy choice of studying the 
Rorschach “as it is,” of studying it “as it 
isn’t,” or of ignoring it altogether. Perhaps 
someone will soon devise a new procedure 
which will incorporate the merits of the Ror- 
schach and eliminate some of the intricacies 
and confounding which make its proper sta- 
tistical study so forbidding. 


THE SELECTION OF THE MENTALLY DEFICIENT
FOR VOCATIONAL TRAINING AND THE EFFECT
OF THIS TRAINING ON VOCATIONAL
SUCCESS ¹


LAWRENCE COWAN AND MORTON GOLDMAN


University of Kansas City


The problems of the mentally deficient 
have recently undergone a period of exten- 
sive examination. One of these problems that 
is receiving the most stress at this time is 
in the area of vocational adjustment (Engel, 
1950). The present study examines the rela- 
tionship of training to this vocational adjust- 
ment and explores several criteria that could 
possibly be used to predict the success or 
failure of the mentally deficient in achieving 
this adjustment. 

Some authors (Engel, 1950; Keys &
Nathan, 1932) have expressed the opinion
that training is of questionable value for the
mentally deficient unless it is specially focused
toward a definite job. However, the
sheltered workshops available in some com- 
munities, giving a more global type training 
with awareness of factors dealing with per- 
sonality adjustment, have done a great serv- 
ice for the mentally deficient in affording 
them an opportunity to receive training and 
helping them to achieve a moderate degree of 
vocational success (Gellman, Gendal, Glaser, 
Friedman, & Neff, 1956; Weinberg, 1955).
As a source of training, these agencies are 
limited by the community’s effort to expand 
them. At the present time they are quite in- 
adequate to handle the number of mentally 
retarded and deficient who are in need of vo- 
cational training (Gellman et al., 1956). 

By drawing training cases from sources other
than workshops, this study examines whether
these other places of training can adequately
prepare the mentally deficient for eventual job
placement. If so, the community will have
other resources to expand in helping the mentally
retarded and deficient to receive vocational
training.


1 This paper is a condensation of a dissertation
(Cowan, 1957) submitted to the Graduate School of
the University of Kansas City by the first author.
The second author served as advisor.

The first hypothesis (H₁) this study examines
is: The mentally deficient, as a result
of vocational training, will be able to
get and keep a job more successfully than a 
matched nontrained group. This hypothesis 
may seem somewhat naive, since one is in- 
clined to think that any trained person would 
be more successful than one not trained in 
getting and keeping a job. But this is not 
necessarily so, for at one time it was thought 
that the jobs a mentally deficient person 
could do were so simple as to warrant no 
training, and, if training were given, it would 
not appreciably benefit that individual (Engel,
1950).

The second hypothesis (H₂) that will be
examined is: The ability to get and keep a
job for the mentally deficient (IQ level 40-79)
varies directly as the IQ level.

The concepts of intelligence and IQ have 
of necessity undergone many changes in em- 
phasis during the past years, swinging from 
no credence placed in the terms to attempting 
to adjust an individual’s educational and vo- 
cational development by using the “intelli- 
gence quotient.” Fortunately for many, the 
use of the IQ score has shifted back toward 
the middle of these extremes. There is one 
area, however, that has been noticeably slow 
in using a more operational concept of intelli- 
gence—this has been in the vocational coun- 
seling of the mentally retarded and deficient. 
It is here, perhaps more than anywhere else,
that the IQ score has been “glorified.” It 
should be pointed out that most professional 
books and articles in the field of vocational 
counseling of the mentally retarded warn 
against adopting the IQ score as categorically 
placing the individual into a class of potential 
achievement (Mausner, 1953). This warning 
is continually being voiced due to the large in- 
cidence of workers doing just that. It is com- 
mon to find vocational counselors with the 
opinion that given two mentally deficient in- 
dividuals, one with a tested IQ of 55 and the 
other with an IQ of 75, the one with the 
higher IQ will make a better vocational ad- 
justment due to the difference in IQ points! 

The third hypothesis (H₃) is concerned
with the examination of several other factors 
which are purported to play a direct role in 
predicting vocational success. It states: Voca- 
tional success can be predicted from effort 
made in attempting to locate a job, formal 
education, and past work experience. A vo- 
cational rehabilitation counselor must make a 
judgment as to the potential of the mentally 
deficient client prior to accepting him for 
services. In doing this he always uses the in- 
formation available. It seems reasonable that 
a counselor might consider a person who has 
made little or no attempt to find a job as not 
having the vocational potential that one who 
has exerted more effort will have. Further, the 
counselor might think that the more education
achieved, the better the chance for success.
Also, he might think that if the individual
has some work experience, he will make
a better adjustment to training on a new job. 
The work history is probably the one factor 
that is most important to the vocational coun- 
selor, for it is here that the counselor feels 
he can best discover the adaptability of his 
client. All these factors taken as a whole give 
the counselor an opinion as to the potential 
of the individual for vocational success. The 
IQ score and the case history, which give him 
information of the nature stated in the third 
hypothesis, are used to arrive at a decision as 
to whether there is a “reasonable expectation” 
that the individual can be vocationally re- 
habilitated. It is true that these factors of 
IQ, work history, etc. are never taken separately
but are examined in relationship to the
total picture of the individual. It is the authors’
view that, regardless of the attempt toward
a total picture, this picture will not
yield reliable information because the criteria 
used may not be valid. If the value of train- 
ing of the mentally deficient is to be judged 
by vocational counselors, it might be neces- 
sary for them to re-evaluate their criteria. It 
may be that techniques of making decisions 
which relate to the advisability of training 
the mentally deficient will need to be ad- 
justed drastically as more up-to-date infor- 
mation becomes available. 


Method 


As a first step, the files of a district office 
of vocational rehabilitation ² were sorted for
all the mentally deficient clients given train- 
ing since the agency opened its office in 1952. 
These individuals constituted the experimen- 
tal group. Those persons who were referred 
with mental deficiency plus a physical handi- 
cap were ruled out since the physical aspects 
of their disability would contaminate the re- 
sults of the study. When this sorting was 
complete, there were 22 individuals who met 
the requirements of the experimental group, 
that is, mentally deficient (congenital defect) 
with no accompanying physical disability, 
who received vocational training. Twenty were 
available for the complete work-up; the other 
two had moved from the city after they had 
received their training. The borderline upper 
limit of retardation or deficiency in IQ as 
stated by Wechsler (1944) was used as an 
upper cutoff score, that is, an IQ of 79. Since 
no individual with an IQ of less than 40 was 
accepted for training, this constituted the 
lower limit. 

A questionnaire was developed to use in
the follow-up interview of this study. It was
designed to yield the following information:
name, age, sex, race, education, referral
source, IQ, date of IQ testing, date of contact,
type of training, where training was
given, and for how long. If the individual had
been employed since the time of training,
information was obtained as to what type of
employment, for how long the job was held,
and how the job was obtained.


2 The Office of Vocational Rehabilitation, Kansas
City, Missouri.

When this information was gathered on all
20 cases, a comparable group of mentally de- 
ficient individuals was taken from the files of 
a private rehabilitation agency to correspond 
to the experimental group. These people were 
selected on the same criteria as the first group 
from a total of 53 cases. Twenty people were 
selected who matched the experimental group 
as closely as possible on sex, age, education, 
race, IQ, and past employment. The major 
difference between these groups was that the 
second group (the control group) had re- 
ceived no vocational training, whereas the 
first group (the experimental group) had re- 
ceived vocational training. These individuals 
in the control group were referred to the private
rehabilitation center for IQ tests.³ These
same people had never heard of the Office 
of Vocational Rehabilitation, and they were 
asked if they would have gone to Vocational 
Rehabilitation had they known about the 
services available there. Their answers were 
in the affirmative. It is thus assumed that the 
difference in referral source between the con- 
trol group and the experimental group is one 
of lack of knowledge concerning the services 
available, and that the placement in the ex- 
perimental or control group was a result of 
chance factors operating in the environment. 

Following the selection of each group, all 
the members of the groups were interviewed 
personally, and the questionnaire was ad- 
ministered. Also, the mothers of all the sub- 
jects (Ss) were interviewed in order to avoid 
the possibility that the individuals with higher 
IQs could give more complete information. 

The results obtained for the experimental 
and control groups on the matching variables 
were as follows: There were 15 males and 5
females in the experimental group and 14
males and 4 females in the control group. All 
the Ss in the study were white. The mean 
age, IQ, and grade for the experimental group 
were 20.8, 64.7, and 4.15; the corresponding
scores, in the same order, for the control
group were 22.9, 61.2, and 5.15. The differences
between the standard deviations of these
corresponding measures were small. The mean
time that elapsed between time of testing and
contact for the experimental group was 3.1
years and for the control group was 2.7. And
finally, the number of cases with no past work
history was 12 for the experimental group and
the same number for the control group.


3 The question can be raised whether all of the
control group would have been acceptable to the
Office of Vocational Rehabilitation for vocational
training. When the records of the office used in this
study were examined, they showed that the total
number of mentally retarded and deficient referrals
was 28. Since 22 of them were accepted, it is assumed
that the majority of the control group would
also have been accepted.

The 40 cases (20 in control and 20 in 
experimental) were interviewed to discover 
what had transpired since the time of train- 
ing for the experimental group and since the 
time of testing for the control group. Infor- 
mation was obtained as to which individuals 
had secured employment and how long they 
had remained employed, whether those em- 
ployed of the trained group had found jobs 
in the area of their training, and whether they 
had been successful in making a vocational 
adjustment. An adequate vocational adjust- 
ment was said to exist if the individual had 
held a paying position for at least 12 months. 
For the purposes of this study, a paying po- 
sition consisted of employment where the in- 
dividual was receiving at least the minimum 
wage set by the laws of the state ⁴ and was
working enough hours during the week to re- 
sult in a salary that was satisfactory to him. 
It should have afforded him a sufficient in- 
come to care adequately for his needs. 
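
The criterion just stated can be written out as a short predicate. This is only a sketch: the record layout and field names are hypothetical, the satisfactory-income judgment is treated as a yes/no answer taken from the interview, and the $1.00 figure is the state minimum wage given in the article's footnote.

```python
from dataclasses import dataclass

MIN_HOURLY_WAGE = 1.00   # state minimum wage named in the article's footnote
MIN_MONTHS_HELD = 12     # "at least 12 months"


@dataclass
class FollowUp:
    """One follow-up record; field names are illustrative, not the authors'."""
    months_in_paying_job: float
    hourly_wage: float
    income_satisfactory: bool   # interview judgment: salary satisfactory to him


def adequate_adjustment(rec: FollowUp) -> bool:
    """True only when all three stated conditions are met."""
    return (
        rec.months_in_paying_job >= MIN_MONTHS_HELD
        and rec.hourly_wage >= MIN_HOURLY_WAGE
        and rec.income_satisfactory
    )


print(adequate_adjustment(FollowUp(14, 1.10, True)))   # meets every condition
print(adequate_adjustment(FollowUp(8, 1.25, True)))    # job held under 12 months
```

Writing the rule as a conjunction makes plain that a well-paid job held for less than a year, or a long-held job below the minimum wage, would both count as unsuccessful under this definition.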


Results 


After all questionnaires were completed, the 
data were combined and yielded the following
over-all information: Of the 20 experimental 
cases (those receiving vocational training), 12 
were employed at the time of contact and had 
been in that status for at least one year. They 
were receiving an adequate income as defined 
in the preceding paragraph. All 12 of these 
cases were placed in their jobs by the vocational
counselor.⁵ The successful group consisted
of 9 males and 3 females, which corresponds
identically with the three to one ratio
of males to females in the total group (15 
males, 5 females). The mean age of the successful
group was 21.75 years, which does not
vary significantly from the total mean age of
the group (20.8 years).


4 In this case, $1.00 per hour.
5 The implications of this will be discussed below.


Table 1

Effect of Training on Vocational Success

Status        Trained   Untrained   Total

Unemployed        8         16        24
Employed         12          4        16

Total            20         20        40

Note. χ² = 8.437 (corrected for continuity); p < .001.

The same criteria for assessing vocational 
success were applied to the control group 
(those receiving no training). Of these 20 
cases, 4 had achieved a successful vocational 
adjustment. To test the significance of the 
relationship of training and employment, the 
chi-square test (corrected for continuity) was 
used.⁶ The results of this test are shown in
Table 1. 

It can be seen from Table 1 that the prob- 
ability of these results occurring by chance 
alone is less than one out of a thousand. 
These results were considered to be highly 
significant, and H₁ is supported.
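
For readers who wish to retrace the computation, the following is a minimal sketch of a chi-square test with Yates' correction for continuity on a 2 x 2 table. The cell counts are those stated in the text (12 of 20 trained and 4 of 20 untrained cases employed); the function names are hypothetical. The figure it returns depends on the counts as read here and is best treated as an arithmetic check; the statistic printed with Table 1 is left as the authors reported it.

```python
import math


def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square for the 2 x 2 table [[a, b], [c, d]] with Yates' correction."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1df(chi2: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Rows: unemployed, employed; columns: trained, untrained.
chi2 = yates_chi_square(8, 16, 12, 4)
print(round(chi2, 3), round(p_value_1df(chi2), 4))
```

The upper-tail probability uses the identity that a chi-square variate with one degree of freedom is the square of a standard normal deviate, so no statistical library is needed.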

To test H₂, the mean IQ of those individuals
who achieved a successful vocational
adjustment in both groups was compared with 
the mean IQ of those in their respective 
groups who were unsuccessful. Table 2 shows 
the mean and standard deviation of the IQs 
of the successful and unsuccessful Ss in each 
group. 

When the means were submitted to the t
test, it was discovered that the difference 
between the mean IQs of the successfully 
trained and the unsuccessfully trained did not 
vary sufficiently to warrant the statement that 
they were from different populations. Likewise,
the difference between the two means of
the untrained was not sufficient to state they 
were from different populations. In other 
words, the differences of IQ between the suc- 
cessful and unsuccessful in either the trained 
or untrained group were not statistically significant.
Further, the differences between the
successful in both groups as compared to the
unsuccessful in both groups were not significant.
These findings then do not support H₂.


6 For all statistical methods used in this study, the
.05 level of confidence will constitute significance.


Table 2

IQ of Successful and Unsuccessful Subjects

Subjects                          Mean

IQ of successful, trained         63.9
IQ of unsuccessful, trained       66.0
IQ of successful, untrained       61.0
IQ of unsuccessful, untrained     61.2
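
As a check on how the subgroup means in Table 2 relate to the group means reported in the Method section, the sketch below recombines them as weighted means. The subgroup sizes (12 and 8 trained, 4 and 16 untrained) come from the employment counts in the text; the helper name is hypothetical.

```python
def pooled_mean(groups):
    """Weighted mean of subgroup means; groups is a list of (n, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# (n, mean IQ) for the successful and unsuccessful cases within each group
trained = [(12, 63.9), (8, 66.0)]
untrained = [(4, 61.0), (16, 61.2)]

print(round(pooled_mean(trained), 2))     # compare with the reported 64.7
print(round(pooled_mean(untrained), 2))   # compare with the reported 61.2
```

Agreement between the recombined figures and the reported group means, to rounding, suggests that the four values in Table 2 are attached to the right row labels.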

In the statistical examination of the third 
hypothesis, it was necessary to test each of 
the three factors separately. The first factor 
examined was formal education. Table 3 shows 
the mean level of education in terms of grades 
and the standard deviation for the groups. 

The means were submitted to the t test, and
the findings were: The difference between the
means of educational level of the successfully
and the unsuccessfully trained was not significant
at the .05 level. The same result
was found for the difference between the
mean levels of education within the untrained
group. The difference between the means of
the successful and unsuccessful Ss, irrespective
of training, was submitted to the t test,
and the difference was also not significant at
the .05 level.

Table 3

Level of Education for the Successful and Unsuccessful

Level of education of successful, trained
Level of education of unsuccessful, trained
Level of education of successful, untrained
Level of education of unsuccessful, untrained


The factor effort exerted to find employment
was examined next. One point was given
to an S for each attempt made to find employment.
Comparing the median effort exerted
by each group showed that the trained
group had a median effort of 1.09, and the
untrained group had a median effort of .50.
The median test was used to test for significance.
It showed that the difference in effort
between the trained group and the untrained
was highly significant (p < .001). However,
it must be pointed out that to receive training
it was necessary for the trained group to
go to the Office of Vocational Rehabilitation.
This automatically gave them one point for
effort. Since the combined median of both
groups was .93, all the Ss in Group I were
over the median. This accounted for the high
significance in the trained group. This statistical
test thus supports the aspect of H₃
that states effort is indicative of vocational
success.
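
The mechanics of a median test can be sketched as follows. The individual effort scores are not given in the article, so the two lists below are hypothetical and serve only to show the procedure: each score is classified as above or not above the combined median, and the resulting 2 x 2 table is tested by chi-square with a continuity correction. Counting ties with the lower half is one common convention and is an assumption here.

```python
import math
from statistics import median


def median_test(group_a, group_b):
    """Median test for two samples; returns (chi-square, p) with 1 df
    and Yates' continuity correction."""
    grand = median(list(group_a) + list(group_b))
    a_hi = sum(x > grand for x in group_a)      # scores above combined median
    b_hi = sum(x > grand for x in group_b)
    a_lo, b_lo = len(group_a) - a_hi, len(group_b) - b_hi
    n = len(group_a) + len(group_b)
    denom = (a_hi + b_hi) * (a_lo + b_lo) * len(group_a) * len(group_b)
    if denom == 0:
        raise ValueError("all scores fall on one side of the combined median")
    chi2 = n * max(abs(a_hi * b_lo - b_hi * a_lo) - n / 2, 0) ** 2 / denom
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical effort scores (one point per attempt); NOT the study's data.
trained_effort = [1, 1, 2, 1, 3, 1, 2, 1, 1, 2]
untrained_effort = [0, 0, 1, 0, 2, 0, 0, 1, 0, 0]
print(median_test(trained_effort, untrained_effort))
```

Because the test looks only at which side of the combined median each score falls on, a point awarded automatically to every member of one group can by itself shift that group above the median, which is the difficulty the authors raise.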

The third factor of H₃ examined was the
relationship of work history to vocational success.
Since the two groups were matched for
the number having previous work experience,
the support of H₁ established the fact that
work history was not pertinent. Each group
had 12 Ss who had past work experience.
What remained to be explored was whether
those individuals in each group who achieved
vocational success consisted of a statistically
significantly higher number who had had past
experience. In each group, the number of
persons with past experience was identical to
the number without. For the 12 successfully
trained, 6 had had past experience and 6 had
not. These results indicated that there was
no difference between having had or not having
had past work experience and achieving
vocational success. This finding, then, did not
support the third factor of H₃.

The final result of the study examined was 
the length of time between the time of IQ 
testing and contact. It already has been 
stated that this time for each group was 
averaged, yielding a mean elapsed time for 
the training group of 3 years, 1 month. The 
elapsed time for the control group was 2 
years, 7 months. If the average time spent in 
training for Group I (4.8 months) is sub- 
tracted from the average time elapsed for this 
group, a corrected figure of 2 years, 9.2 
months is reached. This constitutes the time 
available for Group I to find employment. 
When these figures were compared to the 
mean time elapsed for Group II to find em- 
ployment (2 years, 7 months), it was found 
that both groups had approximately the same 
length of time available to achieve vocational 
success as defined here. 


Discussion 


The primary concern of this study is the 
relationship of training to the vocational success
of the mentally deficient. The results of
the statistical analyses in the preceding section
indicate that those receiving training
made a better vocational adjustment than 
those who did not. When the variations of 
IQ, age, education, etc. were controlled, the 
results were still the same. Training made the 
difference. But training to the mentally de- 
ficient means more to them than just an op- 
portunity to learn new skills. It means that 
someone is interested in them, someone is 
there to encourage their efforts and to help
them handle the disappointments and frustrations
that arise. They may not have received
this support at home. Under a healthier atmosphere
the mentally deficient, as well as
anyone, can and will learn more and produce
more. This brings us to the subject of the
home life of the mentally deficient and how
it relates to vocational adjustment. Talking
with the mothers of these 40 cases brought to
light some interesting differences between the
two groups. These differences were not objectively
measured in this study, but they manifested
themselves too frequently to be ignored.
In talking to the mothers of the cases
in the control group (no training), it was
commonplace to hear them say, “John can’t
work, he’s not bright enough. Everyone would
take advantage of him. He couldn’t get along
by himself; he’s much better off here at home
where I can watch him. Oh, yes! He does a
lot of work around the house.” This quotation
is a composite of remarks from several mothers,
but it represents the general feelings held
by the mothers of Ss in this group. This attitude
on the part of the mother may have
come after John showed no inclination for
outside work and was used defensively by the
mother. It may have come before John was
old enough to work and represented the protective
shield that had surrounded him all his
life. Either way, however, he is receiving no
encouragement to accomplish on his own or
to prepare himself for the time when he may
be on his own. It was striking to compare
these attitudes with those of the mothers of
the group that received training. These mothers,
although they had the same fears as the
other group, were working as much as pos- 
sible for the rehabilitation of their sons and 
daughters. This attitude might have become 
manifest due to the abilities shown by the 
mentally deficient in their training, but it 
may be more indicative of the nature of the 
environment in which they were raised. Per- 
haps the attitude of helpfulness and encour- 
agement results from parental ability to ac- 
cept the mentally deficient as an individual 
and implies no attempt to push him beyond 
his capacities nor to shelter him to such an 
extent that his capabilities are never discov- 
ered. Much has been said in the literature 
about the need to accept the physically and 
mentally disabled, but how does one accept 
a son or daughter who is unconsciously being 
rejected? The answer of course is that the in- 
dividual is not accepted. There is more than 
a razor’s edge between intellectual under- 
standing and emotional acceptance. 

The foregoing constitutes an attempt to 
understand some of the problems of the men- 
tally deficient. Application of these thoughts 
to the experiment reported on here, however, 
can lead to some practical observations. The 
trained group did achieve a better vocational 
adjustment than the nontrained. Why this is 
so is still not too clear. Are those selected for 
training in a better position to profit from 
this emotionally, or does the training itself 
serve as a vehicle for the expression of undeveloped
abilities? This question points to the
need for more experimental evidence to establish
the relationship of emotional development
to vocational and social success. From
the results of this study, it appears that many 
mentally deficient individuals would profit 
from vocational training. This training should 
of course be adjusted to fit the abilities and 
requirements of the individual. The individu- 
als who had a healthy psychological home life 
could benefit from the combined effects of the 
encouraging atmosphere plus the new skills 
acquired. The mentally deficient are intellec- 
tually immature and, as is often the case, 
emotionally immature as well. They are more 
dependent upon their parents, and if a train- 
ing program is to be effective, the agency re- 
sponsible for the training may also have to 
work with the parents. 


For a long time now tested IQ and job
success have been known to be
uncorrelated (Brainerd, 1954). The IQ
test is an important tool in psychology, as 
well as in the field of education and counsel- 
ing, with the mentally deficient. However, 
using IQ scores as a predictive device for vo- 
cational success is not supported by the evi- 
dence in the present study. 

The formal education of the mentally de- 
ficient should also be given some attention. 
Like the IQ score, its use as a predictive de- 
vice for vocational success is not supported 
by the evidence in this study. 

Measuring the effort exerted by the Ss to 
find employment produced some problems. It 
was intended in using this measure to tap 
all methods of securing employment. One of 
the problems, thus, was to decide whether Ss 
in Group I (the experimental group), who 
were all trained and placed through the Office 
of Vocational Rehabilitation, should receive 
credit. Strictly speaking, making contact at 
the Office of Vocational Rehabilitation was 
an attempt to secure employment, and there- 
fore the Ss in Group I all were allotted at 
least one point. However, the question can be 
raised as to whether this method of obtain- 
ing employment was comparable to the other 
methods of finding a job used by the control 
group, Group II. In one respect the individual 
assumed much more responsibility by cooperating
with the plans formulated with the vocational
counselor, but he also did not have
the problem of finding a job opening on his 
own. This was done for him by the vocational 
counselor or the placement service. Because 
of these difficulties in measuring the effort 
exerted in finding a job, it cannot be said that 
a relationship has been found in this study 
between the effort exerted in locating a job 
and ultimate vocational success. It might even 
be possible to assume that it was the job 
placement that accounted for the success of 
the trained group rather than training.⁷

In the experimental group, the group that
received training, we note that 8 of the 20
still remained unemployed. Our data show
that these “failure” cases cannot be distinguished
from the “successful” cases by IQ,
age, previous education, sex, or previous work
history. How then can we account for these
“failure” cases? The answer to this question
will, of course, demand much additional research.
The study reported on here had too
few cases and was not designed to explore
this area. However, certain suggestions can
be made. One area that might be examined
is the personality of the trainee in combination
with the personality of the instructor,
and the relationship established between the
instructor and trainee. Another factor might
be the type of training given and the demand
that exists for people with this type of training.
And finally, of high importance would be
the effort and ability of the placement counselor
who works with the trainee trying to
locate a position for him.


7 A study is presently under way which will compare
two groups of Ss, one group given placement
aid and training, the other group given only placement
aid.


Summary 


This study dealt with the relationship of 
training to the vocational success of the men- 
tally deficient. Twenty cases of mental de- 
ficiency with no accompanying physical dis- 
ability who received vocational training from 
the State Department of Vocational Rehabili- 
tation were matched with 20 nontrained men- 
tally deficient individuals for age, sex, race, 
IQ, education, past work experience, and 
elapsed time since the time of their IQ testing. 

When the questionnaire was administered, it was
discovered that the trained group had a significantly
larger number of vocationally successful
individuals than the nontrained group.
Further, this success was unrelated to their
IQ level, formal education, or past work experience.
Since the trained group was placed
in their jobs by a vocational counselor, the 
results may be influenced by the placement 
itself, which is an integral part of rehabilita- 
tion. The measurement of effort scale showed 
several weaknesses in operation, and no as- 
sumptions concerning the relationship be- 
tween effort and job success seemed feasible. 


The ever-increasing number of chronic neuropsychiatric
patients represents perhaps the
foremost problem in the field of mental health 
today. Despite recent advances, notably in 
the fields of chemotherapy and rehabilitation, 
the burden of the chronic population grows, 
threatening to overwhelm existing hospital 
facilities and available professional person- 
nel. This study was conceived as the first 
step in a three-part plan to: (a) develop an 
index predictive of chronicity; (b) attempt
to isolate and study the determinants of 
chronicity; and (c) develop and test the ef- 
fectiveness of retraining programs based on 
the defining characteristics of the chronic 
population. 

The possibility of a predictive index was 
suggested by a study previously completed 
at the Perry Point Veterans Administration 
Hospital (Giedt & Schlosser, 1955). An 
analysis of the time sequence of discharge 
rate for the patient population revealed that 
61% of admitted patients left the hospital 
during the first 90 days, 25% during the next 
15 months, and only 2% during the remain- 
ing 24 months covered by the study. A pre- 
dictive index, based on demographic charac- 
teristics ascertainable at the time of hospital 
admission, would permit a comparison of the
characteristics of the “quick discharge” population
with the potential chronic population.
In addition, it would facilitate the early,
comprehensive study of the potential chronic
population so that the characteristics of those
who finally did leave the hospital could be
compared with those who remained indefinitely.
Such an early study, before the leveling
effect of hospitalization on social characteristics
and attitudes became operative, was
deemed particularly desirable.

1 The authors wish to acknowledge their indebtedness
to John L. Holland, Albert Pepitone, and Richard
Sanders for their contributions in the original
formulation of this study.

2 Now at VA Hospital, Palo Alto, California.

3 Now at the Philadelphia State Hospital.

The extensive literature (see, for example, 
Zubin, in press) on the prognosis of length of 
hospitalization was reviewed and the follow- 
ing variables selected for study: 


1. Number of previous hospitalizations
2. Age at first neuropsychiatric hospitalization
3. Age at time of current hospitalization
4. Service-connection of disability (per cent)
5. Length of military service
6. Age entering service
7. Months service prior to first NP hospitalization
8. Education
9. Religion
10. Marital status
11. Number of children
12. Combat experience (Yes or No)
13. Neuropsychiatric diagnosis in service (Yes or No)
14. Diagnosis
15. Severity of external precipitating stress
16. Predisposition
17. Degree of incapacity
18. Secondary diagnosis (NP: Yes or No)
19. Legal competence (Yes or No)
20. Alcoholism (History of: Yes or No)
21. Occupational classification level
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Method 


Data for this study were derived from the 
clinical record folders of all male neuropsy- 
chiatric patients admitted to a large Veter- 
ans Administration hospital during the period 
from July 1, 1954 to December 31, 1954. 
With the exception of length of hospital stay, 
the criterion, all of the information used in 
the index was obtained from the initial psy- 
chiatric summary, which is completed within 
three weeks of admission, and from demo- 
graphic data which are recorded immediately 
upon admission. The final total sample for 
the 1954 period consisted of 248 cases. A 
small percentage (less than 5%) of the clini- 
cal record folders were unavailable because 
they had been transferred to other hospitals 
or because they were otherwise unobtainable. 
Where patients had been admitted more than 
once during the period, data from the first 
admission only were included in the sample. 

Information on the 21 variables that had 
been selected for study was recorded and the 
sample was divided into those who were hos- 
pitalized for 90 days or less (Short Stay 
group, N = 120) and those hospitalized for
91 days or longer (Long Stay group, N =
128). The high frequency of discharges found 
during the first 90 days in the earlier study 
(Giedt & Schlosser, 1955) suggested that 
this cutting point might have psychological, 
as well as statistical, significance. The two 
groups were then examined for differences 
within the 21 categories, and the significance 
of differences between groups evaluated by 
chi square. 
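
The group comparison described above can be sketched in a few lines. What follows is a minimal illustration, assuming an ordinary Pearson chi square on a table of counts with one row per group; the function name and the example counts are hypothetical and are not data from this study.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    `table` is a list of rows, one row per group (e.g. Short Stay, Long Stay),
    each row holding the counts for the categories of one variable.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and category.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts (NOT from the study): rows are Short Stay and Long Stay,
# columns are "Yes" and "No" on some two-category variable.
example = [[30, 90], [10, 118]]
stat, dof = chi_square(example)
print(f"chi square = {stat:.2f}, df = {dof}")
```

The resulting statistic would then be referred to a chi-square table at the printed degrees of freedom to obtain the level of confidence.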


Results 


Of the 21 variables, four served to differentiate
the two groups beyond the .001 level
of confidence. These were: diagnosis, degree
of incapacity, legal competence, and alcoholism.
Marital status differentiated the two
groups beyond the .01 level of confidence ⁴
and combat experience beyond the .05 level.
A seventh variable, per cent of service connection,
differentiated the groups between the
.05 and .10 levels of confidence. Finally, an
eighth variable, occupational classification,
significant beyond the .30 level, was retained
in the original predictive index because of the
special significance that has been accorded
it (Frumkin, 1955).

4 Two other variables significant beyond the .01
level, external precipitating stress and predisposition,
were eliminated because of their high correlation
with diagnosis, which was retained in the index.

A score for each item within the eight vari- 
ables was computed following a method de- 
vised by Moran, Fairweather, Morton, and 
McGaughran (1955). This method essentially 
involved computing for each item the prob- 
ability of a patient’s falling in the Long Stay 
group. The log of each probability value was 
then ascertained. Each patient was then as- 
signed a score which was the sum of the 
eight log values that applied to him. 
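
As a rough illustration of the scoring step, the sketch below sums logarithms of category probabilities. It assumes common (base-10) logarithms, which the "9.81023-10" style of notation in Table 1 suggests. The function names are hypothetical, the three marital-status probabilities (.40, .43, .66) are the values legible in Table 1, and the two-variable patient is invented purely for illustration.

```python
import math


def log_value(p_long_stay):
    """Common (base-10) log of the probability of Long Stay for one category."""
    return math.log10(p_long_stay)


def index_score(probabilities):
    """Sum the log values of the categories that apply to one patient."""
    return sum(log_value(p) for p in probabilities)


def as_table_notation(log10_value):
    """Render a negative common log in the older '9.xxxxx-10' notation."""
    return f"{log10_value + 10:.5f}-10"


# Probabilities of Long Stay for marital status as legible in Table 1.
marital = {
    "married or widowed": 0.40,
    "separated or divorced": 0.43,
    "single": 0.66,
}

# Hypothetical patient scored on two variables only; 0.37 is an illustrative value.
score = index_score([marital["single"], 0.37])
print(round(score, 3))
print(as_table_notation(log_value(marital["single"])))
```

Because every probability is below 1, each log value is negative, so a higher (less negative) total corresponds to a greater likelihood of Long Stay.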

Scores thus computed for all cases ranged 
from 5.75 to 2.25. Below an arbitrary cutting 
point of 4.25 were 75 cases. Of these, 85.3% 
were in the Short Stay group. Above a cut- 
ting point of 2.75 were 89 cases. Of these, 
85.4% were in the Long Stay group. This 
index, then, served to predict long or short 
stay for 66.1% of the sample with a high de- 
gree of accuracy. In addition, it allowed for 
the specification of those cases for which ac- 
curate prediction could not be made. 

Prediction for this group of 1954 admis- 
sions might have been spuriously high, of 
course, since the index was constructed from 
the very cases being predicted. Therefore, the 
index was cross-validated on a sample of 1955 
admissions. 


Cross-Validation 


Data were collected for all male neuropsy- 
chiatric patients admitted to the same hos- 
pital during the period January 1, 1955 to 
June 30, 1955. The same scoring procedures 
and cutoff points were employed with this 
sample of 209 cases. Below the cutting point 
of 4.25 were 69 cases. Of these, 87.0% were 
in the Short Stay group. Above the cutting 
point of 2.75 were 76 cases. Of these, 78.9% 
were in the Long Stay group. Thus the index, 
based on 1954 admissions, could have pre- 
dicted length of stay for 69.4% of the 1955 
admissions with 82.8% accuracy. 
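
The percentages reported for the two samples follow directly from the case counts. The short check below assumes that the pooled accuracy is the case-weighted average of the two group accuracies; the function name is illustrative only.

```python
def coverage_and_accuracy(n_total, n_short, acc_short, n_long, acc_long):
    """Return (% of sample predicted, pooled % accuracy among predicted cases).

    acc_short and acc_long are given in per cent.
    """
    n_predicted = n_short + n_long
    coverage = 100 * n_predicted / n_total
    # Weight each group's accuracy by the number of cases it covers.
    weighted_hits = n_short * acc_short + n_long * acc_long
    return round(coverage, 1), round(weighted_hits / n_predicted, 1)


# 1954 construction sample: 75 predicted Short Stay, 89 predicted Long Stay.
print(coverage_and_accuracy(248, 75, 85.3, 89, 85.4))
# 1955 cross-validation sample: 69 predicted Short Stay, 76 predicted Long Stay.
print(coverage_and_accuracy(209, 69, 87.0, 76, 78.9))
```

The first call reproduces the 66.1% coverage of the 1954 sample, and the second reproduces the 69.4% coverage and 82.8% accuracy given for the 1955 admissions.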

In calculating the differential power of the
individual variables on the cross-validation
sample, it was found that three of the eight
variables did not reach the .05 level of significance.
These were: occupational classification,
service connection, and combat experience.⁵
Accordingly, these variables were not
included in the final, revised form of the index.

5 Two of these variables, service connection and
combat experience, were ones in which there was an
unusually high number of “unknowns.” In such cases
modal scores were assigned. It is suggested that these
variables may prove significant in future studies
involving veterans where the data may be gathered
with a higher degree of precision than was possible
in this “file” study.


The Revised Index


The final form of the index was based on
a combination of both samples, embracing all
admissions during the period July 1, 1954 to
June 30, 1955. Scores on the final form were
derived from the remaining five variables exactly
as in the original form. The individual
probabilities furnished by each variable are
given in Table 1.


Table 1

Probabilities of Long Stay Given by Each of the Variables in the Revised Form of the Index

Variable               Cutting points              Probability     Log value
                                                   of Long Stay
Marital status         Married and widowed         .40             9.59988-10
                       Separated and divorced      .43             9.63548-10
                       Single                      .66             9.81624-10
Diagnosis              Organic                                     9.59550-10
                       Psychotic                                   9.89321-10
                       Neurotic                                    9.48287-10
                       Character disorder                          9.01703-10
Degree of incapacity   None, mild, moderate                        9.58320-10
                       Severe                                      9.91169-10
                       Not part of diagnosis b                     9.02119-10
Legal competence       Competent                                   9.51983-10
                       Incompetent                                 9.94498-10
Alcohol                Yes                         .37             9.56585-10
                       No                                          9.81023-10

a N differs for each variable because occasionally information [...] one item [...]. In such
instances, the score used in the index is the log of the modal p value of that [...].
b In the case of Degree of Incapacity, “not part of diagnosis” refers to those [...]
disorders, where this is not included in the formal diagnosis.


The scores of each group distribute as
shown in Fig. 1. The validity of this index,
i.e., the relationship between scores and short
and long stay, clearly depends on the cutting
points selected.
Predictions might be made for every case by 
cutting above and below in which case 
the index would yield 77.2% accuracy. Cut- 
ting scores might be established below 2.1 
and above 2.9. In this case, 172 Short Stay 
cases would be predicted with 84.3% accu- 
racy, and 153 Long Stay cases would be pre- 
dicted with 87.6% accuracy. Thus 71.1% of 
all cases could be predicted with 85.8% accu- 
racy. If homogeneity rather than size of sam- 
ple is emphasized, cutting scores might be 
established below 3.7 and above 1.1. In this 
case, 102 Short Stay cases would be predicted 
with 91.2% accuracy, and 116 Long Stay 
cases would be predicted with 91.4% accu- 
racy. Thus prediction is possible for 47.7% 
of the sample with 91.3% accuracy. The validity
of the index might also be expressed directly
in terms of the correlation between index
scores and short and long stay. A biserial
correlation of .759 was obtained.
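
For readers unfamiliar with the statistic, the biserial coefficient is usually written as follows. This is the standard textbook form, offered as an assumption about how such a value is computed and not as the authors' stated procedure; all symbols are introduced here for illustration.

```latex
r_{b} = \frac{M_{L} - M_{S}}{\sigma_{t}} \cdot \frac{p\,q}{y}
```

Here $M_L$ and $M_S$ are the mean index scores of the Long Stay and Short Stay groups, $\sigma_t$ is the standard deviation of all scores, $p$ and $q = 1 - p$ are the proportions of cases in the two groups, and $y$ is the ordinate of the unit normal curve at the point dividing its area into $p$ and $q$.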


Discussion 
By means of this index, samples of a given 
size and with a given probability of long or
short stay may be selected from the patients
entering the hospital under study. Generalization
of these results to other neuropsychiatric
hospitals must, of course, await further cross- 
validation studies. It should be pointed out 
that none of the variables in the final form 
of the index are restricted to a veteran popu- 
lation, and that it has potential applicability 
in any neuropsychiatric hospital. 

The value of this index for research lies in 
its usefulness for the economical preselection 
of stratified samples for intensive study dur- 
ing the course of hospitalization. Homogene- 
ous populations of potential long or short 
stay patients may be selected for immediate 
study or for assignment to treatment pro- 
grams for evaluative purposes. 

The index is also seen to have immediate 
service value. In those hospitals where an in- 
tensive continued treatment program is avail- 
able, those patients who fall in the Long Stay 
group could be transferred to such a program 
immediately upon completion of the routine 
admission procedures. Correspondingly, short- 
term therapeutic and counseling procedures, 
designed to be completed on the Admissions 
Service, could be instituted without delay for 
those patients in the Short Stay group. 

In addition to the usefulness of the index
for prediction, the variables themselves allow
for certain speculations concerning their
psychological implications. The finding that
length of stay for single patients exceeds that 
for others suggests a possible differentiating 
adaptive factor. The other group, which would 
include the married, widowed, separated, and 
divorced persons, have at some time in their 
lives attracted and formed a relationship with 
another person, and thus may possess greater 
personal resources. An alternative explanation 
is that the single patients are less likely to 
have an established setting in the community 
to which to return. However, this explanation 
fails to account for the fact that separated 
and divorced patients achieved approximately 
the same probability of short stay as did 
married and widowed. 

It should be pointed out that history of 
alcohol was scored in the study wherever it 
was mentioned as contributory to the current 
need for hospitalization, as well as where it 
was an established diagnosis. Several possible 
explanations can be advanced for the highly 
significant relationship between alcoholism 
and “short stay.” In part, it may reflect a 
group of patients who ordinarily maintain 
adequate integration, but whose defenses can 
be temporarily weakened by alcohol, necessitating
short-term hospitalization. On the other
hand, it may reflect the inappropriateness of
current hospital treatment for the alcoholic.
A further theoretical interpretation might be
that the use of alcohol as a means of handling
anxiety is generally not associated with
the development of perseverant psychotic
symptoms.

Fig. 1. Distribution of scores for short-stay and long-stay cases on the revised index.
[Figure legend: Short Stay (N = 225)]

It is of interest that although psychiatric 
diagnoses have been shown to be of doubtful 
reliability (Mehlman, 1952), their implica- 
tions for length of hospital stay have been 
demonstrated to be valid predictors in this 
study. This is seen in three of the final vari- 
ables (Diagnosis, Degree of incapacity and 
Legal competence) which reflect the diag- 
nostic judgments of psychiatric observers. In 
this case these judgments seem to reflect two 
underlying determinants: severity of illness 
of psychotics and the previously mentioned 
inappropriateness of treatment in the general 
neuropsychiatric hospital for alcoholics and 
character disorders. While the argument could 
be raised that these initial judgments merely 
predetermined the patients’ course of hospital 
treatment (short or long), the accuracy of 
their predictive value suggests the more par- 
simonious explanation that the judgments 


genuinely reflect the patients’ need for treat- 
ment in a neuropsychiatric hospital. 


Summary 


The potential value for research in chronic
mental illness of an instrument for the early
prediction of length of hospital stay was indicated.
Data on 21 demographic variables,
available within three weeks of hospitalization,
were gathered from clinical record folders.
Five variables were found to predict significantly
for the initial and cross-validation
samples. Based on the combined samples
(N = 457), an index was devised which predicted
Short Stay (90 days or less) and Long
Stay (91 days or more). The validity of the
index depends on the cutting point selected.
Prediction can be made for the entire sample
with 77% accuracy, or for approximately one-half
the sample with 91% accuracy. Potential
uses of the index and the psychological implications
of the significant variables were
discussed.
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BRIEF REPORTS 


VISUAL AFTERPHENOMENA IN DIAGNOSIS ¹


DERWOOD E. JOHNSON, ROBERT W. BAUER, AND DON R. BROWN

Evansville State Hospital, Evansville, Indiana


Absence or inadequacy of spiral aftereffects
has been noted in patients with organic brain
involvement. There is some evidence to suggest
that functional psychiatric patients also partially
fail in reporting apparent motion or phi phenomena.
Stationary afterimages in these groups
have not been investigated.

In this research, spiral aftereffect and stationary
afterimage responses were noted in normal
individuals, in patients with organic brain impairment,
and in patients having a functional diagnosis
without indication of organic impairment,
who were capable of responding to psychological
examination. An attempt was made to relate response
to degree of impairment or extent of
morbidity by classifying “severe” and “mild”
groups in both categories. A total of 100 employees
and patients was used, with subgroups of
20 subjects. Age differences between groups were
due to the exclusion of aged persons from the
normal and functional groups. Since brain syndrome
diagnoses are common among older patients,
these were included in the organic sample.
Nineteen records (16 of patients diagnosed as
having chronic brain syndrome) were rejected
because responses were doubtful.

The apparatus for the spiral aftereffect was
similar to that used in previous research. To test
a visual afterimage phenomenon not involving
apparent motion, the “black and white” aftereffect
was used. Four cards were presented: a
black card with white center, a white card with
black center, and gray cards for afterimages.

Group scores were evaluated for differences
and Fisher’s t test applied. Postdictive efficiency
of the two tests was also calculated.
Differences within groups in response to the
two tests were attributed to chance. Both tests
yielded significantly different performance records
when normal subjects were compared with groups
of patients. However, neither test discriminated
brain syndrome patients from functional patients
at a satisfactory level, and the rate of false negatives
and false positives was very high. The two
tests were significantly correlated except in the
severe functional group. Although the results support
the conclusion that visual afterphenomena
are deficient in mental hospital patients, they do
not support the hypothesis that observed deficiencies
are associated almost exclusively with
organic brain impairment.

The low postdictive efficiency obtained in this
disabled population would imply no
utility with borderline or equivocal cases
commonly referred for psychological

1 An extended report of this study may be obtained
without charge from Derwood E. Johnson, Evansville
State Hospital, Evansville, Indiana, or for a fee
from the American Documentation Institute. Order
Document No. 5779, remitting $1.25 for microfilm
or $1.25 for photocopies.
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INSIGHT AS A MEASURE OF ADJUSTMENT IN THREE 
KINDS OF GROUP EXPERIENCE ¹


JOHN H. MANN

New York University

AND CAROLA HONROTH MANN

College of the City of New York


The present study is one of a series devoted
to the comparison of the relative effectiveness of
group discussion, task-oriented study group activity,
and group-centered role playing in producing
personality and behavior change. These
three group methods have been used in a wide
variety of settings as a means of influencing the
personalities of group members and were therefore
chosen for comparison in this study.

Criteria for assessing personality change are
numerous. One of the criteria recently used to
measure personality change is insight. In operational
terms, insight is usually defined as the
degree of congruence between an individual’s
view of himself and the view which others have
of him. Insight was used as the criterion measure
in the present study. The insight scores were derived
from factor analytic studies of the observable
characteristics of individuals in small groups.
Such studies have recently been summarized by
Carter (1954), who concludes that most of the
variance of small group interaction can be described
in terms of the following three factors:
Individual Prominence, Group Goal Facilitation,
and Sociability. Two variables highly loaded on
each of these factors were selected as the rating
criteria on which the insight scores were based
in order to ensure the broadest coverage of independent
content in the rating criteria chosen.

The study investigated the relative effectiveness
of the three group methods in improving
insight of group members. Ninety-six Ss were
randomly selected from a graduate course and
assigned to 12 groups (3 discussion, 3 study, and
6 role playing) of eight members each. Measures
of insight were obtained at the beginning and
near the end of each of the group experiences, which
extended over a three-week period. The analysis
of the data obtained from these measures indicated
that (a) the three types of group experiences
did not differ in the degree to which they
increased insight of group members; (b) all group
members showed increase in insight over the
course of the group experiences; and (c) no
relationship was found between amount of insight
and individual adjustment. In addition, it
was found that the role playing Ss revised their
self estimates in the direction of group estimates
more often than group estimates were revised in
the direction of the individual’s self estimate.

1 An extended report of this study may be obtained
without charge from John H. Mann, Graduate
School of Arts and Science, New York University,
Washington Square, New York 3, New York, or for
a fee from the American Documentation Institute.
Order Document No. 5776, remitting $1.25 for microfilm
or $1.25 for photocopies.


Brief Report 
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THE EFFECT OF EGO INVOLVEMENT AND SUCCESS 
EXPERIENCE ON INTELLIGENCE TEST RESULTS ¹ ²


ROBERT C. NICHOLS

Purdue University


Most discussions of intelligence testing proce- 
dures emphasize that rapport and ego involvement 
on the part of the S are important aspects of a 
testing situation, and that cues as to success or 
failure on parts of the test should not be given 
by the examiner. A knowledge of the probable 
effects of these variables is important for users 
of intelligence tests, since in the individual test- 
ing situation the examiner must often make al- 
lowance for lack of optimum attitudes on the 
part of the S or decide to what extent the atti- 
tude of the S has invalidated the test results. 

To obtain data relevant to this question, 11
examiners each administered the Wechsler-Bellevue
Intelligence Scale to eight Ss, half of them
following a success experience, which was
produced by showing the S false norms after he
had taken a brief figure identification test. The
other half of the Ss were similarly given a
failure experience. Half of each
special ego-involving instructions which empha- 
sized the importance of the intelligence test they 
were about to take. The other Ss were given instructions
designed to minimize ego involvement.
The 88 Ss who volunteered for the study were
male undergraduate engineering students ranging
in age from 18 to 32 and ranging in full scale
IQ from 99 to 140. The mean IQ for the group
was 124.

1 The data for this study were collected by the
graduate students in the writer’s class in intelligence
testing at Purdue University in the spring semester,
1956-1957. Those participating in the study were
Irwin Monashkin, graduate assistant, Karl Beck,
Paul Carlson, George Cerbus, Eugene Ebner, Gerald
Engle, Judith Flannigan, Shirley Karas, Harrison
McKay, Ray Mentzer, Deodandus Strümpfer, and
Richard Wich.

2 An extended report of this study may be obtained
without charge from Robert C. Nichols, Department
of Psychology, Purdue University, Lafayette, Indiana,
or for a fee from the American Documentation
Institute. Order Document No. 5777, remitting $1.25
for microfilm or $1.25 for photocopies.

Considering success, ego involvement, and ex- 
aminer as experimental variables, the data were 
analyzed according to a 2 X 2 X 11 analysis of 
variance design. Separate analyses were done for 
the verbal and performance scales, and for digit 
span and information plus comprehension sub- 
tests. The results are clearly negative. In the 28 
F tests computed, no significant effects were 
found. 

These negative findings support the assumption 
that differences in test taking attitude on the 
part of the S and minor differences in testing 
procedure on the part of the examiner do not 
materially affect intelligence test scores. How- 
ever, since the Ss used in this study were all 
intelligent students who are used to taking tests 
and doing their best, the results may not be 
directly applicable to clinic and hospital groups 
who may have less motivation to excel on intellectual
tasks.

Since the 11 examiners used in this study had 
limited experience with the test, the lack of significant
differences between examiners is especially
noteworthy. With normal intelligent Ss, as
used in this study, individual intelligence test 
scores are independent of special motivating or 
distracting aspects of the testing situation over 
a fairly wide range of conditions.